Cloud computing perspectives and questions


The World Economic Forum started a research project at Davos 2009 concerning cloud computing, which they broadly define to include all kinds of remote services, from Software as a Service to virtual machines.

Andy Oram was asked to provide some ideas on the implications of cloud computing for business as well as its future operating environment. This wiki is a discussion forum where anyone with relevant and valid ideas can suggest points for ongoing research into the social and economic issues (as well as relevant technical issues).

Cloud computing is a huge topic, of course, spawning whole fields of study (as well as a lot of hype). This wiki tries to focus on long-term social and economic effects, especially on a global basis.

Definitions of cloud computing

  • WEF definition is very broad
  • Definitions tend to be complex and controversial
  • Most observers agree on several approaches, each defining a different relationship between client and provider:
    • Software as a Service (SaaS)
      • Most computing and often the data storage is performed by vendor. Access by client is through browser or other thin client software.
      • A very old model, called Application Service Providers in the 1990s.
      • Now encompasses:
        • Well-established services such as Salesforce.com
        • Popular storage and social networking services such as Google Docs and Flickr
        • Services offered to cell phone users
    • Infrastructure as a Service (IaaS)
      • Offers virtual environments where clients can build or load software representations of entire computer systems
      • Also has long history, if one counts time-sharing
      • Now covers platforms such as Amazon.com EC2
      • It's notable that first service was by a large consumer of computing power (Amazon.com) instead of a computer vendor or software company
    • Platform as a Service (PaaS)
      • Offers a programming interface where clients can build new applications. More flexible from client point of view than SaaS (which offers a single service, albeit often with plug-ins and APIs) but less flexible than IaaS (which offers the opportunity to run complete operating systems with multiple applications)
      • Recent innovation
      • Best-known example is Google App Engine
  • Data can also be stored in the cloud
    • Replication may be a substitute for back-ups
    • Some services build in replication, often by partitioning data in such a way that a subset of the replicas can rebuild the entire data set
  • The term "private cloud" refers to virtualization implemented within a single organization to support its own operations
    • Can provide some of the same flexibility and culture change as use of an external cloud
  • Peer-to-peer systems permit clients to coordinate storage
    • Still at a young stage
    • Relatively well-known examples include:
      • Tahoe-LAFS (http://allmydata.org/source/tahoe/trunk/docs/about.html)
      • Jesse Vincent's Prophet
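
The point above about partitioning data so that a subset of replicas can rebuild the whole can be illustrated with a minimal sketch. It uses a single XOR parity chunk, which is the simplest possible scheme and is an assumption made here for illustration; real services typically use stronger erasure codes that survive the loss of several pieces. The function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-size chunks and append one XOR parity chunk."""
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")  # zero-pad the last chunk
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]


def rebuild(pieces: List[Optional[bytes]], original_len: int) -> bytes:
    """Recover the data when at most one piece (marked None) is missing."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover only one lost piece")
    restored = list(pieces)
    if missing:
        # XOR of all surviving pieces equals the one that was lost.
        survivors = [p for p in pieces if p is not None]
        restored[missing[0]] = reduce(xor_bytes, survivors)
    # Drop the parity chunk and the zero padding.
    return b"".join(restored[:-1])[:original_len]


# Usage: store five pieces on five hosts, then lose any one of them.
data = b"records kept in the cloud"
pieces = split_with_parity(data, 4)
pieces[2] = None  # simulate one host going away
recovered = rebuild(pieces, len(data))
```

With four data chunks and one parity chunk, the storage overhead is 25 percent, compared with 100 percent for keeping a full second copy.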

APIs and mash-ups

  • Application Programming Interfaces (APIs) are central to all models
  • APIs permit mash-ups, which may become much more pervasive and even dominate over stand-alone applications eventually
  • Possibility of mash-ups can shift innovation to the periphery and democratize it.

Benefits and drawbacks for potential clients

  • Organizations may be formed without cost of creating a systems and communications infrastructure
    • Allows new organizations to be formed with minimal overhead
    • Existing organizations can change personnel, move, experiment, and deploy new services rapidly
    • Less reliance on a central IT group to provision servers
    • May disrupt old centers of power and decision-making, somewhat as the desktop PC did in the 1980s
    • Enables virtual organizations -- with no physical infrastructure, just shared data and processes
  • Total reliance on a cloud service (virtual machine services or SaaS)
    • May be valuable for start-ups and skunkworks
    • For larger organizations, useful for some well-defined functions, particularly non-critical ones. (But note that many companies use services for customer relations management and for paying employees, which could be considered critical functions.)
    • Requires a thorough understanding of the cloud service's operations, the risks involved, and management techniques to handle the service and its risks.
    • SaaS allows vendor to change or remove features capriciously, and clients cannot choose to keep old version by rejecting the upgrade
  • Use of cloud to supplement in-house operations
    • Useful for:
      • Capital-poor companies
      • Companies with growth rates that they can't support
      • Handling peaks and spikes
      • Handling large variations in their normal business volume
      • Handling growth that will eventually be moved in-house
      • Offloading in-house systems for updating, testing and installing major changes
    • Requires skills in both domains (in-house and cloud) as well as strategies for migrating and replicating between them.
    • Best if the cloud supplier can offer strong service level guarantees (SLAs)
    • If clients' system administrators are deskilled by outsourcing system administration, can reduce companies' competence to judge SLAs and negotiate safe contracts.
  • Costs and potential savings
    • Much disagreement over costs of system administration after move to a cloud -- many sysadmin tasks are just as complex and demanding as with stand-alone systems
    • Sunk costs in existing hardware may slow move to cloud computing
  • Effects on innovation
    • In addition to possibly reducing costs, the provision of a platform can simplify tasks for clients developing new uses and applications
      • IaaS removes the need to manage hardware and other physical aspects of data centers
      • PaaS provides the benefits of IaaS and also handles many of the logistical and configuration tasks that programmers traditionally have to handle
    • Choices made by the platform vendor can also, however, limit options for clients
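
The argument for using a cloud to handle peaks and spikes comes down to simple arithmetic, sketched below. The cost model and every figure in it are hypothetical assumptions chosen only to show the shape of the comparison: an in-house deployment is assumed to be sized for peak demand, while a hybrid deployment owns only a baseline and rents the excess by the server-hour.

```python
from typing import Sequence


def in_house_cost(peak_servers: int, owned_cost_per_month: float) -> float:
    """Monthly cost when enough servers are owned to cover the peak."""
    return peak_servers * owned_cost_per_month


def hybrid_cost(
    baseline_servers: int,
    hourly_demand: Sequence[int],
    owned_cost_per_month: float,
    cloud_rate_per_hour: float,
) -> float:
    """Monthly cost when only the baseline is owned and bursts are rented."""
    # Server-hours of demand above what the owned baseline can absorb.
    burst_server_hours = sum(max(0, d - baseline_servers) for d in hourly_demand)
    return (baseline_servers * owned_cost_per_month
            + burst_server_hours * cloud_rate_per_hour)


# Hypothetical month: 700 quiet hours needing 10 servers, 20 peak hours needing 50.
demand = [10] * 700 + [50] * 20
owned = in_house_cost(peak_servers=50, owned_cost_per_month=100.0)
mixed = hybrid_cost(10, demand, owned_cost_per_month=100.0, cloud_rate_per_hour=0.5)
```

Under these assumed figures the hybrid approach is cheaper because the peak is short. A workload that stays near its peak most of the time would reverse the result, which matches the concern that a virtual service may cost more than stand-alone servers for large projects.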

Benefits and drawbacks for vendor of offering software as a service or using a development environment

  • Benefits are extremely compelling
    • Project start-up can be faster and cheaper
    • Potential clients can use software simply by visiting a web page--no need to download anything, unless a plugin is desired
    • Updates are immediate and do not require client action
    • Testing can be simplified by simply cloning an instance of the software environment
    • Reduced support costs because vendor doesn't have to deal with many divergent versions of software in the field (although differences in browsers and other aspects of the environment can still affect the service's behavior)
  • Many free software developers already use a service such as SourceForge or Launchpad to develop and distribute software
  • Drawbacks
    • Main drawback, especially when using cloud service at a relatively high level (development environment or SaaS instead of virtual machines) is delivery through a web browser instead of running with native code
      • Performance impacts (diminishing as technology improves)
      • Lack of access to features of the operating system
      • Restrictions on user interface (diminishing as technology improves)
    • Other drawbacks are the same as for other organizations
      • Administration may be more difficult, at least at current stages of the field's development
      • Costs of using a virtual service may be higher than stand-alone servers for large projects
      • Development tailored to a particular development environment such as Google App Engine or Windows Azure may limit portability

Access

  • The requirement that clients have network access makes cloud services inaccessible or difficult for:
    • People without Internet access (much of the developing world)
    • People with very slow Internet access (many areas in both the developing and developed world)
    • People without continuous Internet access (dial-up, also still common in both the developing and developed world)
  • On the other hand, services that are parsimonious in the use of bandwidth and client-side compute power can (through mobile devices) extend new services to previously cut-off populations.
    • Low computing power requirements on the client side simultaneously lower the cost of the client (e.g. PC, laptop, etc.)
    • SaaS application vendors are viewing mobile devices as an important part of their application stack

Resilience

  • What degree of geographic distribution offers sufficient safety for:
    • Individuals or small companies
    • Major corporations and organizations with reliability requirements
    • Defense and other sensitive government functions
  • Benefits of automatically distributing files, perhaps among multiple vendors (example: Cleversafe, http://www.cleversafe.com/)
  • Potential targets for attack in war or by terror
  • Should there be resilience standards?

Portability

  • Importance: Backups are recommended for persistent data to another system or service outside of the cloud.
  • Feasibility: All APIs can be emulated, so in theory organizations can use the same scripts and procedures to replicate operations in multiple services
  • Trends: There are calls for "open cloud computing," referring to standards that would facilitate portability.
    • Standards could lead to automatic, instant migration between cloud vendors.
    • As with all standardization, it's hard to:
      • Get vendors to cooperate on advances that would reduce client lock-in
      • Slow down innovation in an emerging technology enough to produce a standard
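
The feasibility point above, that the same scripts and procedures could run against multiple services, can be sketched as a thin common interface with one adapter per vendor. Everything named here (BlobStore, InMemoryStore, replicate) is hypothetical and stands in for real vendor APIs, which differ in practice and are not modeled.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable


class BlobStore(ABC):
    """Minimal storage interface that each vendor adapter would implement."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

    @abstractmethod
    def keys(self) -> Iterable[str]: ...


class InMemoryStore(BlobStore):
    """Stand-in for a vendor adapter; a real one would call the vendor's API."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def keys(self) -> Iterable[str]:
        return list(self._blobs)


def replicate(source: BlobStore, target: BlobStore) -> int:
    """Copy every object from source to target; return the number copied."""
    copied = 0
    for key in list(source.keys()):
        target.put(key, source.get(key))
        copied += 1
    return copied


# Usage: the same procedure works for any pair of adapters.
vendor_a = InMemoryStore("vendor-a")
vendor_b = InMemoryStore("vendor-b")
vendor_a.put("report.txt", b"quarterly figures")
count = replicate(vendor_a, vendor_b)
```

The replication procedure depends only on the interface, so switching vendors means writing a new adapter rather than rewriting the procedure. An agreed standard would remove even the need for the adapter.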

Environmental implications

  • Energy trade-offs between concentrated megaservers and smaller systems distributed around the world.
  • Impacts on localities where huge server farms are built.

Software freedom

  • Cloud eviscerates software freedom:
    • New software and patches can be built on free software while still being hidden behind the cloud (except free software under the rarely used Affero GPL).
    • (Mostly in regard to SaaS) Even releasing the source code would have little to no effect, because the real lock-in for a cloud service is its role as central repository: storing the data and (for sites with community aspects) providing connections among different visitors.
  • Solutions:
    • Open formats so clients can extract data and reuse it elsewhere
    • As alternative to centralized services, promote radically distributed systems
      • Individuals maintain control of their own data and data processing and peer with others to share data and processing.

Government use

  • Use of popular cloud services (such as Google Docs)
    • Benefits
      • Familiar to staff and public alike, and therefore easy to promote use
      • Quick and cheap to set up
      • Allows integration of government message and discussion with other popular forums
    • Drawbacks
      • Often have policies that run counter to government needs:
        • Services may access visitor data in ways that treat privacy cavalierly.
        • Services may force visitors to take on liability requirements that governments cannot agree to.
      • Lack the reliability, and sometimes the security, that the public has a right to expect of government services.
      • May not have features governments need.
  • Should governments collaborate on producing public-domain or open-source social networks and cloud services tailored to their needs?

Cloud computing standards

  • Much talk of "open cloud computing" that would facilitate moving instances of servers or applications between vendors
  • Standards for cloud performance measurement and rating
    • Attempt to support SLAs and permit vendors to compete on price/performance
    • Compute-cycle measurement proposal by Reuven Cohen (http://www.elasticvapor.com/2009/05/redux-universal-compute-unit-compute.html)

Security in cloud computing

  • Data and operations in the cloud can be as secure as they are on physical systems, and the basic building blocks of security are the same
  • But the weights of different threats and solutions are different inside and outside the cloud
    • To protect data from unscrupulous service vendors, from those who can potentially break in, and from government demands, encryption becomes more important (see #Legal and privacy considerations)
    • Because systems are not on a local area network, administrators must create virtual LANs or forgo the use of protocols and services that rely on the cooperation of benevolent parties on a local area network (includes popular filesystem sharing protocols such as Microsoft's CIFS and Unix's NFS)
    • Access is also controlled by password, not by IP address or other indications of physical location
    • Firewalls, both on the IP and application levels, are still valuable, although the concept of a DMZ or of being "behind the firewall" is virtual, not tied to the architecture of a physical network
    • Backups and replication are crucial because virtual servers often go down and all data can be lost
  • Clouds might provide a recovery strategy for distributed denial-of-service attacks (Cloudbursting proposal by Reuven Cohen, http://www.elasticvapor.com/2009/07/federal-cloudbursting-cyber-defense.html)

Legal and privacy considerations

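  • When data is stored on a vendor's server, the client must trust the vendor or encrypt all data (see #Security in cloud computing)
  • Questions about ownership of data (particularly for SaaS)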
    • When different clients' data has additional value when aggregated (such as in social networks), the vendor may claim ownership
      • Clients can usually delete items of data, but bulk deletions may be difficult
      • The vendor usually leaves copyright with the client, but claims certain rights to use
    • The vendor may use information about the client for marketing and other purposes, depending on the terms of service and privacy policy
    • Questions whether vendors can be held liable for illegal use of systems or illegal content stored on them
  • In the United States, data stored by a third party is more easily demanded by governments than data on the client's own systems (which requires a subpoena)
  • Regulations (such as HIPAA and Sarbanes-Oxley in the United States) might rule out the use of a cloud or increase the difficulties of using one
  • Laws apply according to the jurisdiction where the data is stored, not the location of the client
    • For instance, European Union privacy laws require personal data to be stored in a jurisdiction that respects their privacy directive
    • Some governments require services to pay local taxes for services rendered to clients in those jurisdictions (Luminotes case, http://luminotes.com/blog/big-news-luminotes-is-now-free), adding complexity to transactions