Efficient Data Classification For Successful Cloud Migration

Data can be both an asset and a risk for organizations. Companies that fail to manage their data properly are not always aware that they may be compromising their own success.

This is true for on-premises environments, where enforcing data governance and data classification policies is as simple as the willingness of the privacy, data, and IT functions to collaborate (which, as you may know, is not that simple at all). Today, organizations must also account for the cloud context on top of that.

Cloud adoption is data-driven by its very nature. The assumed risk of cloud service providers interacting with a company’s data in any capacity can lead companies to implement prescriptive security safeguards that don’t quite fit the organization’s context, and as a result cloud strategy objectives are not met.

Knowing which data requires protection is a prerequisite for any successful cloud strategy. This means investing in categorizing data, since classification is the true compass for data protection.

Data classification standards and regulations

When turning to standards bodies such as ISO or NIST for cloud strategy guidance, companies must balance the intent of the standards against their business objectives. For example, the ISO 27002 standard states that information should be classified with a level of protection that reflects its criticality to the organization, while the NIST-developed FIPS process focuses on securing data according to confidentiality, integrity, and availability.
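
As a rough illustration of rating data against confidentiality, integrity, and availability, the sketch below gives each objective an impact level and takes the highest one as the overall rating (a "high-water mark" rule). The `Impact` levels, the `SecurityCategory` class, and the payroll example are assumptions made for this illustration; they are not quoted from either standard.

```python
from dataclasses import dataclass
from enum import IntEnum


class Impact(IntEnum):
    """Impact levels, ordered so they can be compared (illustrative names)."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass(frozen=True)
class SecurityCategory:
    """Potential impact if each security objective is lost."""
    confidentiality: Impact
    integrity: Impact
    availability: Impact

    def overall_impact(self) -> Impact:
        # High-water mark: the overall rating is the worst of the three.
        return max(self.confidentiality, self.integrity, self.availability)


# Hypothetical example: payroll records that must stay secret and accurate
# but could tolerate a short outage.
payroll = SecurityCategory(
    confidentiality=Impact.HIGH,
    integrity=Impact.MODERATE,
    availability=Impact.LOW,
)
print(payroll.overall_impact().name)
```

Taking the maximum is only one possible policy. An organization could just as reasonably keep the three ratings separate and apply different controls to each.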

To further complicate the matter, standards-based organizations and cloud service providers have defined their own data classification categories. Schemes in use include “High”, “Medium”, and “Low”; “Tier 1”, “Tier 2”, and “Tier 3”; “Public”, “Internal Use”, “Restricted Use”, and “Sensitive”; or a scale of 1 to 5.

The need to sustain compliance with regulations such as GDPR or CCPA makes it even more important to manage and classify data properly and consistently.

Classifying data with context - the data life cycle

When a company adopts external standards, the human element of subjectivity, experience, and interpretation must be considered. A company may define a data classification process that is understood and executed differently depending on the party performing the work. For example, data that the Finance department marks as "secret" may be perceived quite differently by the IT department.

Context may also influence the classification of data, such as when an "internal" document becomes subject to more stringent controls ("legal hold") as the result of a legal issue. Moreover, historic documents have most likely already been distributed outside the corporate network, which makes a change in their classification even more difficult to manage. Routine business operations may also change a document’s status: annual sales projections, for example, may move from a draft state in October to a read-only classification in January.
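
One way to picture context-dependent classification is to store the owner's label separately from the events that override it, and to compute the effective handling rules on demand. The sketch below is a minimal illustration under that assumption; the `Document` fields, the `effective_handling` helper, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Document:
    name: str
    base_label: str                      # label assigned by the owner, e.g. "internal"
    legal_hold: bool = False             # set when the document is tied to a legal issue
    finalized_on: Optional[date] = None  # date the draft becomes read-only


def effective_handling(doc: Document, today: date) -> dict:
    """Return the handling rules that apply to `doc` on `today`."""
    # A legal hold overrides whatever label the owner originally chose.
    label = "legal hold" if doc.legal_hold else doc.base_label
    # A document is frozen once it is finalized or placed on hold.
    read_only = doc.legal_hold or (
        doc.finalized_on is not None and today >= doc.finalized_on
    )
    return {"label": label, "read_only": read_only}


projections = Document("annual sales projections", "internal",
                       finalized_on=date(2024, 1, 1))
print(effective_handling(projections, date(2023, 10, 15)))  # still a draft
print(effective_handling(projections, date(2024, 1, 10)))   # now read-only
```

Keeping the base label untouched means the original classification is still on record if the override is later lifted.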

Managing cloud metadata in the DevOps era

Unfortunately, information asset owners are not always fully aware of the data they hold or even where it resides. The rapid nature of CI/CD within DevOps practices changes the speed at which businesses must respond to data classification concerns.

Increased business agility through cloud adoption can promote greater integration of data between disparate sources (suppliers, insurers, cloud service providers, etc.) to deliver a single view of a service, a hybrid cloud approach with data stored across multiple cloud service providers, or even the aggregation of data in cloud-based backup systems. This means that a company's data carries an inherent risk of being exposed to multiple entities. Therefore, data governance and data protection are tied to an effective data classification process.


Risk-minded data classification

Simply classifying data does not make it secure or protected. What you do with the data determines the information's degree of protection. When assigning importance to information, it is valuable to ask why it is important and what the consequences of a data leak or loss of confidentiality would be. Data classification criteria should be easy to understand and yield consistent results.

Automated classification removes the user from the process. This potentially offers more opportunities for classification, because the process can work faster and cover a wider variety of data types and locations. Such tools scan or intercept data, apply algorithms to determine its category, and record the outcome in some way. Automated classification tools appear in many forms and markets: DLP, file analysis, CASB, and ECM systems, among others.

However, companies must assess if a tool’s proprietary format can limit interoperability or cause lock-in to a particular vendor because the portability of the metadata is limited.
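
To make the scan, analyze, and record flow of an automated classifier concrete, here is a deliberately small sketch. It is not how any particular DLP or CASB product works: the two detectors, the label names, and the `classify` function are assumptions chosen for illustration. The result is returned as a plain dictionary, which is one simple way to keep the resulting metadata portable.

```python
import re

# Illustrative detectors only; real tools use far richer rules and models.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def luhn_valid(candidate: str) -> bool:
    """Check a digit string with the Luhn checksum used by payment cards."""
    digits = [int(c) for c in candidate if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text: str) -> dict:
    """Scan `text`, apply the detectors, and record the outcome."""
    findings = []
    if PATTERNS["email"].search(text):
        findings.append("email")
    if any(luhn_valid(m.group()) for m in PATTERNS["card_number"].finditer(text)):
        findings.append("card_number")

    # Hypothetical label policy: no finding means "not yet reviewed",
    # not "safe to publish".
    if "card_number" in findings:
        label = "restricted"
    elif findings:
        label = "internal"
    else:
        label = "unreviewed"
    return {"label": label, "findings": findings}


print(classify("Card 4111 1111 1111 1111 on file for jane@example.com"))
```

The checksum step shows why pattern matching alone is rarely enough: without it, any long run of digits would be flagged as a card number.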

Conclusion and recommendations

Given the risk of data leaks, the perpetually expanding volume and propagation of unstructured data, and the transient nature of data during its lifecycle, applying the highest level of security across all data points may seem like the best safeguard. But risks attributed to data cannot be eliminated; they must be managed. One key point to remember is that data classification processes were defined as guidance and recommendations rather than as inflexible, prescriptive standards. Therefore, a well-defined cloud adoption strategy should be tied to a data classification process that fits business needs, and should take the following recommendations into account.

•Make users active in the data protection process to help define a framework that works for the entire business

•Establish data classification rules based on a small set of classification tiers that leave no room for interpretation

•Define data profiles that account for the data's business value, physical location, and interface boundaries

•Leverage cloud services to support data classification. For example, Amazon Macie can help customers inventory and classify sensitive and business-critical data stored in AWS.

•Implement tagging of cloud assets and labeling with automation tools

•Apply column-based encryption to data where necessary

•Understand the business risk of non-compliance with regulations
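
As a small illustration of the tagging and reduced-tier recommendations above, the sketch below audits an asset inventory and flags anything whose classification tag is missing or falls outside the agreed tiers. The tag key, the three tier names, and the inventory format are assumptions for this example. In practice the inventory would come from a cloud provider's tagging or resource API rather than a hard-coded list.

```python
ALLOWED_TIERS = {"public", "internal", "restricted"}  # hypothetical three-tier scheme
TAG_KEY = "data-classification"                        # hypothetical tag name


def audit_tags(assets: list[dict]) -> list[str]:
    """Return the IDs of assets whose classification tag is missing or invalid."""
    flagged = []
    for asset in assets:
        tier = asset.get("tags", {}).get(TAG_KEY)
        if tier not in ALLOWED_TIERS:
            flagged.append(asset["id"])
    return flagged


# Hypothetical inventory of cloud assets and their tags.
inventory = [
    {"id": "bucket-finance", "tags": {TAG_KEY: "restricted"}},
    {"id": "bucket-logs", "tags": {}},                # tag missing
    {"id": "db-crm", "tags": {TAG_KEY: "secret"}},    # not one of the agreed tiers
]
print(audit_tags(inventory))
```

Running a check like this in a CI/CD pipeline is one way to keep classification in step with fast-moving deployments.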

