Unpacking Microsoft Azure Cosmos DB Encryption Limitations
- Maor Volokh
Microsoft’s Azure Cosmos DB is a globally distributed, multi-model database service that lets enterprises scale throughput and storage for high-performance applications across geographical regions, at volumes of up to trillions of requests per day. This managed database has been gaining popularity among enterprises seeking to reduce IT costs and management overhead while migrating big data and related workloads to cloud environments.
From our discussions with clients, we see that the most common Cosmos DB use cases are catalog data storage and order processing pipelines, IoT and telematics, and web and mobile applications. The service supports key-value, document, graph and columnar data models, along with data access APIs including the SQL API, MongoDB API, Gremlin API and Table API. Understanding encryption and security for this cloud-based service is key to successful development in test environments as well as production deployments.
Azure Data Security - who is responsible?
Similar to other cloud vendors, Azure uses a shared responsibility security model: the cloud provider secures the physical infrastructure, network elements and platform-level data protection, while customers are responsible for protecting the security and privacy of their own data. The following model details the responsibilities of the cloud provider versus customer:
When using Azure Cosmos DB, data is encrypted automatically in every region, both at rest (with AES-256) and in transit. Encryption is on by default, and Azure exposes no control to turn it off. Cosmos DB media attachments and backups are stored encrypted in Azure Blob storage, which is generally backed by HDDs.
Azure Encryption in transit and at rest
The Microsoft Azure Cosmos DB encryption service supports encrypting data on two levels: in transit and at rest. Data at rest includes storage objects and physical volumes. Encryption in transit protects data while it moves across private and public networks between applications and storage.
Microsoft recommends always using SSL/TLS protocols for network data exchange, for both data privacy and compliance. All Azure storage data is encrypted and decrypted using 256-bit AES encryption and is FIPS 140-2 compliant. Encrypted resources are accessible via Azure Storage Service Encryption, Azure Key Vault, and Azure Active Directory, with Azure encryption at rest set as the default state for Cosmos DB.
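As a rough illustration of the TLS recommendation above, a client can refuse to connect over anything weaker than a chosen protocol version. The sketch below uses only Python's standard `ssl` module. The helper name `make_tls_context` and the choice of TLS 1.2 as the floor are assumptions made for this example, not settings taken from Azure documentation.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context (hypothetical helper for illustration)."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumption for this sketch: reject any protocol older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_context()
```

A context like this can be passed to standard library clients, for example `http.client.HTTPSConnection(host, context=context)`, so that every request to a database endpoint travels over a verified TLS channel.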
Azure Storage Service Encryption (SSE) automatically encrypts data before it is stored and decrypts it when it is retrieved. The process is transparent to cloud resource users.
The Azure Key Vault service stores, protects and manages cryptographic keys and secrets for customer cloud applications and services. Keys can easily be provisioned for application and data encryption across testing, development and production environments, and security admins can grant or revoke access to keys. The Azure Key Vault service also allows for “BYOK” or “Bring Your Own Key” functionality for full control over a tenant. In this case, the private key is added to the Key Vault and managed with Azure Information Protection.
Azure Active Directory acts as the Identity as a Service (IDaaS) solution, enabling customers to store and manage encrypted identity and access credentials in the cloud. An Azure AD instance is an isolated set of Directory Object Data provisioned and owned by the customer in the Azure AD core store. It is scoped to a single tenant based on the user’s security token, which provides tenant isolation.
Understanding Azure Encryption Limitations
When considering Azure Cosmos DB encryption at rest, there are two security models that must be taken into consideration: “Client-side Encryption” and “Server-side Encryption.” Encryption in transit is handled by the transport protocol and is generally not a factor in determining which encryption at rest model to use.
The Client-side Encryption model refers to encryption that is performed from the client’s trusted organization environment. According to Microsoft, “the encryption can be performed by the service application in Azure, or by an application running in the customer data center.” In both cases, the Azure Resource Provider receives encrypted data without the ability to decrypt it in any way or to access the encryption keys. Key management is done by the calling service or application and is opaque to the Azure service.
The Server-side Encryption model refers to encryption performed by the service. Azure Storage, for example, can receive data in plain text and will perform the encryption and decryption using internal Azure functions. Depending on the configuration, the Resource Provider may use encryption keys managed by Microsoft or, with the BYOK approach, by the customer.
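The difference between the two models can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: `ServerSideStore` and `OpaqueStore` are stand-ins for a storage service, not Azure APIs, and the cipher is a toy keystream built from HMAC-SHA256 so that the example runs on the standard library alone. It is not AES-256 and should not be used to protect real data.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from HMAC-SHA256 in counter mode (a stand-in, NOT AES-256)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256)
        out += block.digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return nonce || ciphertext. Illustrative only: no integrity protection."""
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce, regenerate the keystream and undo the XOR."""
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


class ServerSideStore:
    """Hypothetical service for the server-side model: it holds the key itself."""

    def __init__(self):
        self._service_key = os.urandom(32)  # key lives inside the service
        self._disk = {}

    def put(self, doc_id: str, plaintext: bytes) -> None:
        # The service receives plain text and encrypts before writing.
        self._disk[doc_id] = encrypt(self._service_key, plaintext)

    def get(self, doc_id: str) -> bytes:
        # The service decrypts transparently on read.
        return decrypt(self._service_key, self._disk[doc_id])


class OpaqueStore:
    """Hypothetical service for the client-side model: it stores bytes as given."""

    def __init__(self):
        self._disk = {}

    def put(self, doc_id: str, blob: bytes) -> None:
        self._disk[doc_id] = blob

    def get(self, doc_id: str) -> bytes:
        return self._disk[doc_id]


record = b'{"order": 1001, "card": "4111-1111-1111-1111"}'

# Server-side model: the caller sends and receives plain text.
server_side = ServerSideStore()
server_side.put("order-1001", record)

# Client-side model: the caller encrypts first and keeps the key.
client_key = os.urandom(32)
opaque = OpaqueStore()
opaque.put("order-1001", encrypt(client_key, record))
stored = opaque.get("order-1001")       # ciphertext, unreadable by the service
recovered = decrypt(client_key, stored)  # only the key holder can do this
```

In the server-side case the key and the encrypted record sit in the same object, which mirrors the concern about key location discussed in this section. In the client-side case the store never sees the key, so it can neither read nor process the record.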
When considering these two models, one must take the following issues into account:
- First, with both options the keys must always be available within the cloud infrastructure. The strength of encryption depends heavily on where the encryption keys are kept: if keys and encrypted data sit in the same environment, hostile entities can reach the data far more easily.
- For client-side encryption, IT leaders should consider that without the ability to decrypt, cloud-based logic might cease to operate. The result is either client-side logic (not scalable) or loss of functionality.
- When choosing BYOK to tackle the issue of losing control over encryption keys, one must take into account that:
  - Both encryption keys and encrypted records are still accessible within the cloud infrastructure. Keys are therefore accessible to malicious insiders and potential external attackers, leaving the organization vulnerable to a data breach.
  - When using database encryption, data is available in plaintext in the application server's memory as well as the database memory, which makes it vulnerable to memory dump attacks and similar techniques.
- In both cases, the key lifecycle is managed by Azure. Key management, key storage and all key-related actions are performed by the cloud vendor. As a result, encryption keys do not remain on-premises or within the client's defined trusted environment, and control over the keys is willingly handed to a third party.
- Last but most important, Azure encryption services are compatible only with Azure infrastructure. When building your cloud data protection stack, you must account for the possibility of adopting multi-cloud and hybrid cloud models. As you incorporate more cloud services that are exclusive to each provider’s infrastructure, you add complexity in the following layers:
  - Management overhead: there are multiple environments, each operating with its own set of APIs and SDKs.
  - Deployment overhead: all of these services usually need to communicate with one or more on-prem components. Developers and Configuration Management resources must understand varying API service definitions, assess which functionality or capability features to enable within each product, and determine how best to integrate the products into the enterprise.
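The client-side encryption point in the list above, that cloud-based logic may stop working on data it cannot decrypt, can be shown with a small Python sketch. All names here are hypothetical: `server_side_filter` stands in for any query or processing logic running in the cloud, and a one-time-pad XOR stands in for a real cipher such as AES-256 purely to keep the example short and dependency-free.

```python
import secrets


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR two equal-length byte strings (one-time pad, illustration only)."""
    return bytes(d ^ p for d, p in zip(data, pad))


def server_side_filter(documents, field, value):
    """Stand-in for cloud-side query logic: it sees only what the service stores."""
    return [doc["id"] for doc in documents if doc[field] == value]


# Client-held key material, never sent to the service.
pads = {}


def client_encrypt_field(doc_id: str, value: bytes) -> bytes:
    """Encrypt one field on the client with a fresh random pad."""
    pads[doc_id] = secrets.token_bytes(len(value))
    return xor_bytes(value, pads[doc_id])


plain_docs = [
    {"id": "1", "city": b"Amsterdam"},
    {"id": "2", "city": b"Stockholm"},
]
encrypted_docs = [
    {"id": doc["id"], "city": client_encrypt_field(doc["id"], doc["city"])}
    for doc in plain_docs
]

# With plain text (server-side encryption model) the cloud query works.
plain_matches = server_side_filter(plain_docs, "city", b"Amsterdam")

# With client-side encryption the service only holds ciphertext, so the
# same query finds nothing useful.
encrypted_matches = server_side_filter(encrypted_docs, "city", b"Amsterdam")

# The fallback is client-side logic: fetch every document and decrypt locally,
# which grows with the size of the data set.
client_matches = [
    doc["id"]
    for doc in encrypted_docs
    if xor_bytes(doc["city"], pads[doc["id"]]) == b"Amsterdam"
]
```

The client-side fallback returns the right answer, but only after pulling every record out of the service, which is the scalability cost described above.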
To sum up, Azure’s services may look like the easiest way to achieve a decent level of data protection for Cosmos DB, and they offer good features and functionality for a variety of use cases. However, enterprises migrating their most sensitive applications need to consider all aspects, including limitations and downsides, before choosing an encryption solution.