Windows Azure is Microsoft's application platform for the public
cloud. Your applications can use this platform in many different ways.
For instance, you can use Windows Azure to build a web application that
runs and stores its data in Microsoft datacenters. You can use Windows
Azure only to store data, with the applications that use that data
running on-premises (that is, outside the public cloud). You can use
Windows Azure to help on-premises applications connect with each other
or to map between different sets of identity information or in other
ways. Because the platform offers a wide range of services, all of these
things—and more—are possible.
To do any of them, though, you need to understand the basics. Even if
you don't know anything about cloud computing, this article will walk
you through the fundamentals of Windows Azure. The goal is to give you a
foundation for understanding and using this cloud platform.
Table of Contents
- The Components of Windows Azure
- Cloud Applications
- Data Management
- Business Analytics
- Messaging
- Networking
- Caching
- High-Performance Computing (HPC)
- Commerce
- Identity
- SDKs
- Getting Started
The Components of Windows Azure
To understand what Windows Azure offers, it's useful to group its
services into distinct categories. Figure 1 shows one way to do this.
Figure 1: Windows Azure provides a set of cloud services running in Microsoft datacenters.
To understand Windows Azure, you need to know what its components do.
The rest of this article walks through the technologies shown in the
figure, describing what each one offers and when you might use it.
Cloud Applications
A Windows Azure application is much like one that runs in your own
datacenter. You write code in your preferred language, such as C#, Java,
PHP, Node.js, or something else. You then execute that code in virtual
machines running Windows Server. But because Windows Azure is designed
to help you create applications that are more reliable, more scalable,
and require less administration, creating a Windows Azure application
isn't exactly the same as building an on-premises application. Even
though most of what you already know about building software for Windows
Server applies to Windows Azure, there are a few new things you need to
understand.
In Windows Azure, every application is implemented as one or more roles.
Each role contains the code and configuration information required to
carry out some part of your application's function. Windows Azure today
provides two main types of roles: web roles and worker roles. A web role
is designed for code that interacts directly with web browsers or other
HTTP clients—it depends on IIS, Microsoft's web server. A worker role
is more general, designed to run a variety of code.
If you just want to create a simple ASP.NET or PHP application, for
example, you might use only a web role. For a more demanding
application, you might decide to use a web role to interact with users,
then rely on a worker role to carry out the requests those users make.
And in some cases, you'll use just a worker role, such as with an
application that processes lots of data in parallel.
Whatever roles you break your application into, the code for each one will execute in a role instance.
Each role instance is really just a virtual machine (VM) running a
flavor of Windows Server, and each one executes in a Windows Azure
datacenter. Figure 2 shows how this looks for a simple application that
runs two instances of a web role.
Figure
2: The code for every Windows Azure application runs in some number of
role instances, each of which is really a virtual machine.
In this example, each web role instance runs an identical copy of the
application's code, along with a version of Windows Server, and Windows
Azure automatically load balances all user requests across these two
instances. To deploy the application, a developer doesn't explicitly
create VMs. Instead, he gives the application's executable to Windows
Azure, indicating what kind of role instances (web or worker) he wants
and how many of each he needs. Windows Azure will create the specified
number of VMs for each role, then start a copy of the role's executable
in each one. The developer need only indicate what he wants, and the
platform does the rest.
Once an application is running, Windows Azure continues to monitor
it. If your code fails or the role instance it's running in crashes or
the physical machine the instance is executing on goes down, Windows
Azure will start a new instance of this role. The platform also handles
applying updates to the physical and virtual machines your application
relies on, including things like deploying new patched versions of the
operating system. Because of this, a Windows Azure application typically
runs two or more instances of each role. This lets the platform take
down and update one VM at a time while the application keeps on running.
Since it includes built-in services like these, Windows Azure fits into
the cloud computing category known as Platform as a Service (PaaS).
If the load on a Windows Azure application increases—maybe you've
acquired a large number of new users all at once, for instance—a
developer or the application itself can just ask for more instances. If
the load decreases, another request can shrink the number of instances.
And because Windows Azure charges you by the hour for each instance, you
pay only for the capacity you need. (See here for more on Windows Azure pricing.)
Data Management
Every Windows Azure application runs in one or more role instances,
i.e., in one or more VMs. Each VM has local storage, which an
application is free to use. Remember, though, that Windows Azure will
periodically shut down instances for maintenance. Because of this, data
that an application wishes to store persistently—which is almost
everything—must be stored outside of the VMs in which the application
runs. To allow this, Windows Azure provides three data management
options, as Figure 3 shows.
Figure 3: For data management, Windows Azure provides relational storage, scalable tables, and unstructured blobs.
Each of the three options addresses a different need: relational
storage, fast access to potentially large amounts of simple typed data,
and unstructured blob storage. In all three cases, data is automatically
replicated across three different computers in the Windows Azure
datacenter to provide high availability. As the figure shows, all three
options can be accessed either by Windows Azure applications or by
applications running elsewhere, such as your datacenter, your laptop, or
your phone. And however you apply them, you pay for all Windows Azure
data management services based on usage, including a gigabyte-per-month
charge for stored data. (Again, see here for pricing details.)
SQL Azure
For relational storage, Windows Azure provides SQL Azure. Think of
SQL Azure as a cloud-based analog of SQL Server. SQL Azure provides all
of the key features of a relational database management system (RDBMS),
such as transaction management, concurrent data access across multiple
users with data integrity, and queries via ANSI SQL. Like SQL Server, it
can be accessed using Entity Framework, ADO.NET, Java via JDBC, and
other familiar data access technologies. You can interact with SQL Azure
very much like you can with SQL Server; it supports most of the T-SQL
language and can be accessed using tools such as SQL Server Management
Studio and SQL Server Data Tools. For anybody familiar with SQL Server
(or even another relational database), using SQL Azure is simple.
But SQL Azure isn’t just a DBMS in the cloud—it’s a PaaS service. You
still control your data and who can access it, but SQL Azure takes care
of things like managing the hardware infrastructure and automatically
keeping the database and operating system software up-to-date. In other
words, it handles much of the administrative grunt work.
If you’re creating a Windows Azure application that needs relational
storage, SQL Azure is your best option today. Applications running
outside the cloud can also use this service, though, so there are plenty
of other scenarios. For instance, data stored in SQL Azure is easy to
access from different client systems, including desktops, laptops,
tablets, and phones, wherever they might be. And because it provides
built-in high availability through replication, using a SQL Azure
database minimizes downtime. SQL Azure also includes SQL Azure Data
Sync, which is a service that enables users to synchronize data between
on-premises SQL Server databases and cloud-based SQL Azure databases
without requiring any programming.
Tables
Suppose you want to create a Windows Azure application that needs
fast access to hundreds of gigabytes of typed data, but doesn’t need to
perform complex SQL queries on this data. For example, imagine you’re
creating a consumer application that needs to store customer profile
information for each user. Your app is going to be very popular, so you
need to allow for lots of data, but you won’t need to do much with this
data beyond storing it, then retrieving it in simple ways. This is
exactly the kind of scenario where Tables make sense.
Don’t be confused by the name: Tables don’t provide relational
storage. If you need RDBMS features such as referential data integrity,
database-managed transactions, and SQL query capabilities, your best
choice is SQL Azure. Instead, Tables let an application store properties
of various types, such as strings, integers, and dates. An application
can then retrieve a group of properties by providing a unique key for
that group. While complex operations like joins aren’t supported, Tables
offer fast access to typed data. They’re also very scalable, with a
single table containing as much as a terabyte of data. And matching
their simplicity, Tables are usually less expensive to use than SQL
Azure’s relational storage.
Blobs
The third Windows Azure option for data management, Blobs,
is designed to store unstructured binary data. Like Tables, Blobs are
cheap, and a single blob can be as large as one terabyte. An application
that stores video, for example, or backup data or other binary
information is likely to use blobs for simple, cheap storage. Windows
Azure applications can also use Windows Azure drives, which let blobs provide persistent storage for a Windows file system mounted in a Windows Azure instance.
Business Analytics
One of the most common ways to use stored data is to create reports
based on that data. To let you do this with relational data in SQL
Azure, Windows Azure provides SQL Azure Reporting. A subset of the
reporting services provided with SQL Server, SQL Azure Reporting lets
you build reporting into Windows Azure applications. The reports it
creates can be in many formats, including HTML, XML, PDF, Excel, and
others, and they can be embedded in applications or viewed via a web
browser.
Another option for doing analytics with SQL Azure data is to use
on-premises business intelligence tools. To a client, SQL Azure looks
just like SQL Server, and so the same technologies can work with both.
Messaging
No matter what it's doing, code frequently needs to communicate with
other code. One common way to do this is through queued messaging,
although other approaches can also make sense. Because different
applications have different requirements, Windows Azure provides two
different technologies for this kind of communication: Queues and
Service Bus.
Queues
The service provided by Windows Azure Queues is easy to
understand: One application places a message in a queue, which is
eventually read by another application. One common use of Queues today
is to let a web role instance communicate with a worker role instance,
as Figure 4 illustrates.
Figure 4: Queues are commonly used to let web role instances communicate with worker role instances.
For example, suppose you create a Windows Azure application for video
sharing. The application consists of PHP code running in a web role
that lets users upload and watch videos, together with a worker role
implemented in C# that translates uploaded video into various formats.
When a web role instance gets a new video from a user, it can store the
video in a blob, then send a message to a worker role via a queue
telling it where to find this new video. A worker role instance—it
doesn't matter which one—will then read the message from the queue and
carry out the required video translations in the background. Structuring
an application in this way allows asynchronous processing, and it also
makes the application easier to scale, since the number of web role
instances and worker role instances can be varied independently.
Service Bus
Along with Queues, Windows Azure provides Service Bus, a somewhat
more general approach to connecting software through the cloud. Service
Bus also provides a queuing service, but it's not identical to the
Queues just described. (For a more detailed comparison of the two, see here.) One common use of Service bus is to connect different applications, as Figure 5 shows.
Figure 5: Service Bus lets applications communicate either through queues or directly.
Applications that communicate through Service Bus might be Windows
Azure applications, for example, or software running on some other cloud
platform. They can also be applications running outside the cloud,
however. For example, think of an airline that implements reservation
services in computers inside its own datacenter. The airline needs to
expose these services to many clients, including check-in kiosks in
airports, reservation agent terminals, and maybe even customers' phones.
It might use Service Bus to do this, creating loosely coupled
interactions among the various applications.
Unlike Windows Azure Queues, Service Bus queues provide a
publish-and-subscribe mechanism. An application can send messages to one
or more topics, while other applications elect to receive only messages
sent to specific topics. This allows flexible one-to-many communication
among a set of applications. And queues aren't the only option: Service
Bus also allows direct communication through its relay service,
providing a secure way to interact through firewalls.
Networking
Windows Azure runs today in six datacenters: two in the United
States, two in Europe, and two in Asia. When you use Windows Azure, you
select one or more of these datacenters to run your application and/or
store your data. To route requests among these datacenters, Windows
Azure provides Traffic Manager. To connect on-premises servers to
applications in a particular datacenter, it offers the Connect service.
This section looks at both of these technologies.
Traffic Manager
An application with users in just a single part of the world might
run its role instances in just one datacenter. An application with many
users scattered around the world might run role instances in multiple
datacenters, maybe even all six of them. In this second situation, you
face a problem: How do you intelligently assign users to application
instances? Most of the time, you probably want each user to access the
datacenter closest to her, since it will likely give her the best
response time. But what if that copy of the application is overloaded or
unavailable? It would be nice to route her request automatically to
another datacenter. This is exactly what's done by Windows Azure Traffic Manager, as Figure 6 shows.
Figure
6: If your application runs in multiple datacenters, Windows Azure
Traffic Manager can route user requests intelligently across them.
The owner of an application defines rules that specify how requests
from users should be routed to datacenters, then relies on Traffic
Manager to carry out these rules. For example, users might normally be
routed to the closest Windows Azure datacenter, but get sent to another
one when the response time from their default datacenter exceeds a
certain threshold. For globally distributed applications with many
users, having a built-in service to handle problems like these
automatically is useful.
Connect
Another concern for the creator of a Windows Azure application is
connecting back to on-premises systems. For example, suppose you want to
write an application that runs on Windows Azure but accesses data
stored in a database on Windows Server inside your organization. To
address this problem, Windows Azure provides the Connect service, shown in Figure 7.
Figure
7: Windows Azure Connect makes it easy to establish a secure link
between an on-premises server and a Windows Azure application.
Connect provides a simple way to establish a secure IPsec connection
between a Windows Azure application and a computer running Windows
Server. A developer just installs the Connect software on the
on-premises server—there's no need to involve a network
administrator—and configures the Windows Azure application. Once this is
done, the application can communicate with the computer directly. It
can access a database on that machine, for instance, just as if it were
on the same local network.
Caching
Applications tend to access the same data over and over. One way to
improve performance is to cache that data closer to the application,
minimizing the time needed to retrieve it. Windows Azure provides two
different caching services: one for in-memory caching of data used by
Windows Azure applications and a second that caches blob data on disk
closer to its users.
In-Memory Caching
Accessing data stored in any of Windows Azure's data management
services—SQL Azure, Tables, or Blobs—is quite fast. Yet accessing data
stored in memory is even faster. Because of this, keeping an in-memory
copy of frequently accessed data can improve application performance. To
allow this, Windows Azure includes In-Memory Caching, illustrated in Figure 8.
Figure 8: In-Memory Caching speeds up a Windows Azure application's access to frequently used data.
An application can store data in this cache, then retrieve it
directly without needing to access persistent storage. For better
performance and reliability, the cache is implemented as a distributed
service, with the data it contains spread across multiple computers in a
Windows Azure datacenter.
An application that repeatedly reads a product catalog might benefit
from using In-Memory Caching, for example, since the data it needs will
be available more quickly. The technology also supports locking, letting
it be used with read/write as well as read-only data. And ASP.NET
applications can use the service to store session data with just a
configuration change.
Content Delivery Network (CDN)
Suppose you need to store blob data that will be accessed by users
around the world. Maybe it's a video of the latest World Cup match, for
instance, or driver updates, or a popular e-book. Storing a copy of the
data in all six Windows Azure datacenters will help, but if there are
lots of users, it's probably not enough. For even better performance,
you can use the Windows Azure CDN, illustrated in Figure 9.
Figure
9: The Windows Azure CDN stores a copy of a blob at dozens of locations
around the world, letting users in different countries access that blob
more quickly.
The CDN has dozens of sites around the world, each capable of storing
copies of Windows Azure blobs. The first time a user in some part of
the world accesses a particular blob, the information it contains is
copied from a Windows Azure datacenter into the local CDN storage in
that geography. After this, accesses from this part of the world will
use the copy cached in the CDN—they won't need to go all the way to the
nearest Windows Azure datacenter. The cached data has a configurable
timeout, after which a request will cause a new copy to be transferred
into local CDN storage. The result is faster access to frequently
accessed data by users anywhere in the world.
High-Performance Computing (HPC)
One of the most attractive ways to use a cloud platform is for parallel processing. Commonly known as high-performance computing (HPC),
this approach relies on executing code on many machines at the same
time. On Windows Azure, this means running many role instances
simultaneously, all working in parallel to solve some problem. Doing
this requires some way to schedule applications, which means
distributing their work across these instances. To allow this, Windows
Azure provides the HPC Scheduler. Figure 10 shows a simple picture of this technology.
Figure 10: The HPC Scheduler schedules parallel applications that run simultaneously in multiple role instances.
This service can work with HPC applications built to use the
industry-standard Message Passing Interface (MPI). Software that does
finite element analysis, such as car crash simulations, is one example
of this type of application, and there are many others. The HPC
Scheduler can also be used with so-called embarrassingly parallel
applications, such as Monte Carlo simulations. Whatever problem is
addressed, the value this component provides is the same: It handles the
complex problem of scheduling parallel computing work across many
Windows Azure role instances.
Commerce
The rise of Software as a Service (SaaS) is transforming how we
create applications. It's also transforming how we sell applications.
Since a SaaS application lives in the cloud, it makes sense that its
potential customers should also look for solutions online. This change
applies to data as well as to applications. Why shouldn't people look to
the cloud for commercially available datasets?
Microsoft addresses both of these concerns with Windows Azure Marketplace.
Potential customers can search the Marketplace to find Windows
Azure-based applications that meet their needs, then sign up to use them
either through the application's creator or directly through the
Marketplace. Customers can also search the Marketplace for commercial
datasets, including demographic data, financial data, and other
offerings. When they find something they like, they can access it either
from the vendor or directly through the Marketplace. Dataset vendors
have the choice of storing their information themselves or on Windows
Azure.
Identity
Working with identity is part of most applications. Knowing who a
user is lets an application decide how it should interact with that
user. To help do this, Microsoft provides Windows Azure Active Directory.
Over time, this directory service will expand to include a broad
range of traditional identity services. The first thing it provides,
however, called the Access Control Service (ACS), addresses a specific set of identity problems, including these:
-
ACS makes it easy for an application to accept identity information
from Facebook, Google, Windows Live ID, and other popular identity
providers. Rather than requiring the application to understand the
diverse data formats and protocols used by each of these providers, ACS
translates all of them into a single common token format.
-
ACS lets an application accept logins from one or more Active
Directory domains. Just as ACS rationalizes the identity information
provided by various Internet identity providers, it can also provide
this service for enterprise Active Directory identities. For example, a
vendor providing a SaaS application might use ACS to give users in each
of its customers single sign-on to the application.
-
ACS lets the owner of an application define rules for working with
and transforming a user's identity information outside of the
application itself. ACS can indicate that access should be denied, for
instance, or convert identity information into a particular format
required by an application.
SDKs
Windows Azure is a Windows Server-based environment, but it's not
restricted to .NET—you can create Windows Azure applications in pretty
much any language. Microsoft provides language-specific SDKs today for
.NET, Java, PHP, and Node.js, and there's also a general Windows Azure
SDK that provides basic support for any language, such as C++ or Python.
These SDKs can be used with Visual Studio and Eclipse, and they're
available either from Microsoft's Windows Azure site or on github.
Windows Azure also offers command line tools that developers can use
with any editor or development environment. Whatever language and tool
you choose, Windows Azure provides your application with a reliable,
scalable platform with low administration requirements.
Along with helping you build Windows Azure applications, these SDKs
also support creating applications running outside the cloud that use
Windows Azure services. For example, you might build an application
running at a hoster that relies on Windows Azure blobs, or create a tool
that automatically deploys Windows Azure applications through the
platform's management interface.
No comments:
Post a Comment