Cloud Storage Decision Making

by Mitch Tulloch [Published on 12 Nov. 2015 / Last Updated on 12 Nov. 2015]

This article provides IT decision makers with guidance for using public cloud storage such as Microsoft Azure cloud storage services for your organization or business.

Introduction

Many organizations today use cloud service providers to store sensitive information and business data because of the convenience and predictable costs associated with this approach. Many businesses have hesitated, however, and some that have already implemented cloud storage are now having second thoughts because of growing security concerns surrounding cloud computing. Organizations thinking about using cloud storage services should carefully consider both the benefits and the risks of storing their data in the cloud. To examine the various issues involved, I've asked Greg Schulz, an expert in cloud storage, to share some of his thoughts with us here in my column on CloudComputingAdmin.com.

Greg Schulz is the Founder and Senior Analyst of an independent IT advisory consultancy firm called Server StorageIO and UnlimitedIO LLC (a.k.a. StorageIO). Greg has worked in IT for electrical utility, financial services and transportation firms in roles ranging from business applications development to systems management, architecture, strategy and capacity planning. He is also the author of the Intel Recommended Reading List books Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press) and Resilient Storage Networks (Elsevier). Greg is also a Microsoft Most Valuable Professional (MVP) and six-time VMware vExpert. You can learn more about Greg and his company here.

Using Microsoft Azure for cloud storage

Let's say that you have been tasked with using (or trying) public cloud storage such as Microsoft Azure, or have decided that it is time to do so. OK, now what do you do, and what decisions need to be made? Keep in mind that Microsoft Azure, like many other popular public clouds, provides many different services for a fee (subscription) along with free trials. These services include applications, compute, networking and storage, along with development and management platform tools.

Besides cloud storage, Microsoft Azure also has services for:

  • Compute (where you can run your applications, containers or virtual machines)
  • Web and Mobile along with Developer and other tools
  • Analytics for little and big data along with Networking
  • Media and Content Distribution Networks (CDN)
  • Identity and Access Management (IAM) & Security

Microsoft Azure cloud storage services include:

  • Azure Search and StorSimple cloud gateway appliance
  • SQL Database and DocumentDB (NoSQL) and SQL Data Warehouse
  • Files (preview) -- Accessible within Azure cloud or via external REST API
  • Queues and Redis Cache and Tables
  • Storage Blobs (Block blobs, append blobs and page blobs)
  • Premium storage (preview) -- Non-blob block storage accessible within Azure cloud including HDD and SSD based storage for virtual machines

So which Microsoft Azure cloud storage service do you need or want to use?

There are many different options; however, if you know what you need, it's a simple matter of understanding the options and aligning them to your needs. If you don't already know which service you need, let's take a look at your options to help with Azure storage decision making. Start by reviewing your application's server, storage and I/O Performance, Availability, Capacity and Economic (PACE) requirements, among others. Selecting cloud server and storage I/O resources along with application functionality is similar to selecting physical resources and traditional software. Keep in mind that you may require more than one type of data and storage service, since not everything is the same across, or even within, different applications.

Understanding your application and environment needs for cloud storage involves making decisions in the following areas:

  • Application focus: How will your application use data storage (activity and access)
  • Availability: How resilient and protected does your data storage need to be
  • Capacity: How much data do you need to store and access
  • Economics: How much will it cost up front and on a recurring basis, and what is your data worth
  • Location matters: How far is your application from the data
  • Management: How will you manage, protect, preserve and serve data
  • Performance: How much performance do you need

Let's explore each of these different decision areas in more detail.

Application focus

How will your application use data and storage? For example, does it need access to a SQL Server (or NoSQL) database or data warehouse? Do you simply need storage on which to create a database, key-value store, object store or other repository? Do you need storage space for active, changing data, or for data that is static, write once read many (WORM) and immutable? Do you need fast, low-latency storage close to where your application resides, or do you need lower-cost, high-capacity storage with good durability? How will your applications access and use the data and storage: via database SQL or ODBC; using XML, JSON, S3, Swift or CDMI among other APIs and programmatic bindings; using file shares such as NFS, CIFS/SMB or HDFS (Hadoop Distributed File System); or simply as a block or blob of storage? Or will your application need to access message queues and simple key-value tables? Think of it this way: if you were adding new storage on-site, what would you be looking at doing? Buying a dedicated or shared block SAN type storage system, a NAS for file and data sharing, object storage, a database server or something else?

Availability

What level of availability do you need in terms of data or storage being accessible? How much downtime can you or your application afford? Another consideration is durability. While accessibility refers to whether you can get to and use your data (that is, whether the service is available), durability refers to how many copies, including optional versions, exist in different locations. This is important for enabling the 4 3 2 1 principle (also called a rule, guide or policy) for data protection: you should have 4 (or more) copies of your data, with at least 3 different versions, stored on 2 or more different systems, servers or devices, with at least 1 of them off-site or away from where your application primarily resides. Other availability considerations include privacy along with security via encryption, identity and access management (IAM), audit, access controls, logs and reporting. These all combine to support normal business resiliency (BR) as well as business continuance (BC), high availability (HA) and disaster recovery (DR). These and other considerations help guide you to the level of resiliency you need, including whether to keep copies of your data in different Azure data centers or regions (geographic locations).
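The 4 3 2 1 guideline above can be turned into a simple check. The following is a minimal sketch in Python, not part of any Azure tooling; the `DataCopy` record, its field names and the sample inventory are all hypothetical and exist only to illustrate the counting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One stored copy of a data set (all fields are illustrative)."""
    version: str   # e.g. a backup label or point-in-time identifier
    system: str    # the system, server or device holding the copy
    offsite: bool  # True if kept away from the primary application site


def meets_4_3_2_1(copies):
    """Return True if the copies satisfy the 4 3 2 1 guideline."""
    enough_copies = len(copies) >= 4
    enough_versions = len({c.version for c in copies}) >= 3
    enough_systems = len({c.system for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_versions and enough_systems and has_offsite


# Hypothetical inventory: three on-site copies plus one copy in cloud storage.
copies = [
    DataCopy("monday", "onsite-nas", offsite=False),
    DataCopy("tuesday", "onsite-nas", offsite=False),
    DataCopy("wednesday", "onsite-san", offsite=False),
    DataCopy("wednesday", "cloud-blob", offsite=True),
]
print(meets_4_3_2_1(copies))
```

Dropping the cloud copy from this sample inventory would fail the check twice over: only three copies would remain, and none of them would be off-site.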

Capacity

How much space do you need now, and in the future? Does the space need to be active on-line or near-line, and how many copies do you need? Also, how will you initially move or import your data into Azure: on-line over the network, or by physically shipping it on a Hard Disk Drive (HDD)?
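A rough transfer-time estimate can help with the on-line versus ship-a-drive question. The sketch below is plain arithmetic in Python; the function name, the 80% link efficiency and the sample figures are assumptions for illustration, not Azure measurements.

```python
def upload_days(data_tb, link_mbps, efficiency=0.8):
    """Estimate the days needed to upload data_tb terabytes over a network link.

    efficiency is an assumed fraction of the link usable for the transfer.
    Decimal units are used: 1 TB = 10**12 bytes, 1 Mbps = 10**6 bits/second.
    """
    bits = data_tb * 10**12 * 8
    seconds = bits / (link_mbps * 10**6 * efficiency)
    return seconds / 86_400  # seconds per day


# Hypothetical case: 10 TB over a 100 Mbps link at 80% efficiency.
print(f"{upload_days(10, 100):.1f} days")  # prints "11.6 days"
```

If the estimate runs to weeks for your initial data set, physically shipping drives may be worth pricing out, while later incremental changes can still move on-line.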

Economics

Shopping for storage, both cloud and physical, commonly focuses on cost per capacity. While upfront costs are often advertised along the lines of "as low as" some price per GByte per month, pay attention to the fine print and details. For example, you can find cloud storage for a penny per GByte or less per month; however, what are the extra fees when you need to use or access the data and storage? Likewise, for a very low price, what will the performance, availability and other characteristics be? Microsoft Azure, like most other major and reputable cloud service providers, does not hide fees or have secret ones. However, you do have to look at and understand what they document. For example, in exchange for a low cost per month, what are the access fees, bandwidth charges and other recurring costs? Also, understand what additional costs will be incurred for any hardware, software, management tools, gateways and import/export fees associated with using any cloud storage service.
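To see why cost per GByte alone can mislead, it helps to add the recurring charges together. The following Python sketch is illustrative only: the function, its parameter names and every price in the example are hypothetical, so substitute the figures from the provider's published price list.

```python
def monthly_cost(stored_gb, egress_gb, requests,
                 price_per_gb, egress_price_per_gb, price_per_10k_requests):
    """Return a breakdown of one month's storage bill in dollars."""
    breakdown = {
        "storage": stored_gb * price_per_gb,
        "bandwidth": egress_gb * egress_price_per_gb,
        "access": requests / 10_000 * price_per_10k_requests,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


# Hypothetical prices for illustration only.
bill = monthly_cost(stored_gb=5_000, egress_gb=1_000, requests=2_000_000,
                    price_per_gb=0.01, egress_price_per_gb=0.09,
                    price_per_10k_requests=0.005)
for item, dollars in bill.items():
    print(f"{item:>9}: ${dollars:,.2f}")
```

With these made-up numbers the bandwidth charge is larger than the capacity charge, which is exactly the kind of surprise that reading the fine print is meant to avoid.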

Location matters

Where does or will your application reside: within Microsoft Azure running on a cloud server instance, or on premise at your own (or somebody else's) site outside of the Azure cloud, running on a physical, virtual or other cloud server instance? The location of your application matters because, as with other cloud service providers such as Amazon Web Services (AWS), Google, HP, IBM/SoftLayer, Rackspace and VMware vCloud among others, some storage services are restricted to use within their respective clouds, while others can also be accessed externally. For example, higher performance cloud storage services such as premium (HDD and SSD) storage are only available within and from Azure instances. This is similar to AWS Elastic Block Store (EBS) among others. On the other hand, storage blobs (e.g. buckets and objects) are accessible internally as well as externally to the Azure cloud, similar to AWS Simple Storage Service (S3) buckets and objects, or Google Cloud Storage (GCS) containers and objects among others. Thus, knowing where your application currently runs, and from where it will access storage, is important for cloud storage decision making.

Management

How will you manage the data and storage? Will you simply use the Azure service portal, or other means? If you are using SQL Server, will you also use SQL Server Management Studio or Visual Studio among other tools for managing it? It comes back to applications and how they will need to use and access the storage. Other considerations include how you will access the various storage and data services, and who will have access to their security keys.

Performance

What level of performance do you or your applications need? This will depend on whether you are looking to use Azure storage for bulk storage of data such as videos, images, photos, audio recordings or other large (and small) static data, or whether you need storage with transactional ability to support many reads or writes. Location matters for performance: the closer your application is to the data, or your data and storage are to the applications, the better. For example, there are Azure premium storage services that include fast flash-based Solid State Devices (SSD) as well as HDDs. However, those services need to be accessed from within Azure. Understand what your applications need in terms of latency or response time, messages processed, reads and updates, and IOPS, to help determine the applicability of different Azure services. Note that various Azure data and storage services have limits on performance activity in terms of bandwidth, IOPS, transactions, reads, writes, Gets, Puts and other operations. Understand what those limits are relative to your application needs so that you can design and decide appropriately.
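Comparing application needs against documented service limits is easy to automate once both are written down. This is a minimal Python sketch; the metric names and all numbers are hypothetical placeholders, not actual Azure limits, which should be taken from the current service documentation.

```python
def limit_shortfalls(needs, limits):
    """Return {metric: (needed, limit)} for each need that exceeds its limit.

    Metrics with no documented limit are skipped rather than guessed at.
    """
    return {
        metric: (needed, limits[metric])
        for metric, needed in needs.items()
        if metric in limits and needed > limits[metric]
    }


# Hypothetical workload needs and service limits, for illustration only.
workload_needs = {"iops": 8_000, "bandwidth_mb_s": 150, "latency_ms": 5}
service_limits = {"iops": 5_000, "bandwidth_mb_s": 200}

print(limit_shortfalls(workload_needs, service_limits))
# prints {'iops': (8000, 5000)}
```

A non-empty result suggests either a different class of service or a design change, such as spreading the workload across several storage accounts or disks.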

Making decisions concerning cloud storage

Let's look at how to leverage the above information to navigate and make some Azure data and storage services decisions. The first table below shows various Azure data and storage services along with general tips on what to use when and where.

| Azure | Similar to | Accessible via | Use for |
|---|---|---|---|
| Blobs (objects) | AWS S3 & Glacier, Google GCS, OpenStack Swift | Via Azure and externally from on-premise or other clouds using API, HTTP/REST, various other tools and software | Block blobs -- General purpose storage for images, videos, photos, files or other objects. New blobs are created for updates. General file or data storage, backups and archives. Append blobs -- Logs, telemetry, events, errors or other files to which data gets appended. Page blobs -- Readable and writable storage such as images, virtual machines, VHD/VHDX. |
| Tables, SQL Server, SQL Data Warehouse | SQL, NoSQL, and other RDS type services | Applications within Azure and on-premise, via SQL Server and Visual Studio among others | Tables -- Lightweight table lookup. SQL Server -- Cloud based SQL Server. SQL Data Warehouse -- Data analytics and big data processing. NoSQL -- Document and other database or repository for big data and little data. |
| Files (Preview) | AWS EFS, OpenStack Manila among others | Native within Azure, via HTTP REST API from on premise | File sharing using SMB 2.1 between Azure VMs or via remote on-premise systems via REST-based API. Similar to how standard SMB 2.1 based file sharing is used between Windows and other servers in a traditional or virtual environment. |
| Premium Storage (Preview) | AWS EBS among others | Accessible within Azure cloud from VMs (instances) | Low latency high-performance persistent block storage accessible by Azure VM instances. HDD and SSD based for use similar to traditional SAN, DAS, SAS or SATA attached storage. |

Table 1: Microsoft Azure Storage Options Decision Making
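Table 1 can also be read as a lookup: given how an application accesses data and where it runs, which services remain candidates? The Python sketch below encodes only what Table 1 states; the dictionary layout, the access-type labels and the `shortlist` helper are hypothetical conveniences rather than any Azure API.

```python
# Simplified encoding of Table 1. "external" means the service can be reached
# from outside the Azure cloud (for Files, via the REST-based API).
SERVICES = {
    "Blobs": {"access": {"object"}, "external": True},
    "Tables / SQL": {"access": {"table", "database"}, "external": True},
    "Files (preview)": {"access": {"file"}, "external": True},
    "Premium Storage (preview)": {"access": {"block"}, "external": False},
}


def shortlist(access_type, app_inside_azure):
    """List services matching the access type that the application can reach."""
    return [
        name
        for name, service in SERVICES.items()
        if access_type in service["access"]
        and (app_inside_azure or service["external"])
    ]


print(shortlist("object", app_inside_azure=False))  # ['Blobs']
print(shortlist("block", app_inside_azure=False))   # []
print(shortlist("block", app_inside_azure=True))    # ['Premium Storage (preview)']
```

The empty result for block storage from outside Azure reflects the point made under "Location matters": premium storage is only available to Azure VM instances.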

The second table below shows different Azure data and storage resiliency options that vary in cost. The level of Azure resiliency you choose from Table 2 depends on the value of your data as well as the importance or business benefit of your applications being available. Each company must find the proper balance between application and data availability and economic cost or benefit.

| Azure Resiliency | Description | Available | Durable | Usage |
|---|---|---|---|---|
| Locally Redundant Storage (LRS) | Three synchronous copies in same data center on different storage servers | 99.9% read/write | | Lower cost durable storage protects against hardware failure in the data center. |
| Zone Redundant Storage (ZRS) | Three synchronous copies across data centers and storage servers within or across regions | 99.9% read/write | | Durability and resiliency to protect across data centers and hardware failure. Higher cost, higher business, application availability benefit. |
| Geographic Redundant Storage (GRS) | Same as LRS plus extra asynchronous copies to second data center hundreds of miles away, six total copies | 99.99% read/write | | Protection from localized hardware (drive, server, rack, and network), software or data center failure and BC/DR protection. Higher cost, higher availability benefits. |
| Read Access Geographic Redundant Storage (RA-GRS) | Same as GRS except secondary copies are read access | 99.9% write, 99.99% read | | Use similar to GRS, however where secondary copies only need to be read vs. written. |

Table 2: Azure data and storage resiliency options that vary in cost
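The availability figures in Table 2 are easier to weigh against business needs when converted into permitted downtime. The following Python sketch does that arithmetic; the function name and the 30-day month are assumptions chosen for illustration.

```python
def allowed_downtime_minutes(availability_pct, period_hours=30 * 24):
    """Convert an availability percentage into minutes of permitted downtime.

    period_hours defaults to an assumed 30-day month (720 hours).
    """
    unavailable_fraction = 1 - availability_pct / 100
    return unavailable_fraction * period_hours * 60


for pct in (99.9, 99.99):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% available -> about {minutes:.1f} minutes per 30-day month")
```

Going from 99.9% to 99.99% cuts the permitted downtime by a factor of ten, from roughly 43 minutes to roughly 4 minutes per month, which is the benefit to weigh against the higher cost of the more resilient options.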

Conclusion

To end this article, here are some general storage decision-making tips and recommendations:

  • Know your application and workload performance, availability, capacity needs
  • Look beyond cost per GByte of storage per month
  • Evaluate the availability and durability of each class of storage service
  • Understand performance limits along with associated access fees
  • Do your homework to be aware of other recurring fees to avoid surprises
  • Some services are accessible only within Azure such as premium storage
  • Other services such as blobs can be accessed externally to Azure
  • Identify what tools, plug-ins, drivers or gateways you will need to add to your environment
  • I use the Azure portal, as well as tools such as CloudBerry and Cyberduck among others
  • Check out one of the various Microsoft free trials and tutorials to get some behind-the-wheel test driving time, so you can better understand your options and which Azure storage services best fit your specific needs.

Don't be scared of clouds or Microsoft Azure storage options; instead, be prepared, and know how to navigate your various options to meet your specific needs and requirements. Shopping for and evaluating cloud storage options is similar to what you would (or should) do when acquiring traditional physical storage. Granted, some of the names, terms and access technologies will be different. Learn more about Azure, along with data and storage options here as well as at my blog http://www.storageioblog.com.

Additional Resources

For more information about Microsoft Azure storage resiliency and configuration options, see https://azure.microsoft.com/en-us/documentation/articles/storage-redundancy/.

The Author — Mitch Tulloch

Mitch Tulloch is a well-known expert on Windows Server administration and cloud computing technologies. He has published over a thousand articles on information technology topics and has written, contributed to or been series editor for over 50 books.
