We previously laid out a four-step framework for data-centric security in a recent Forbes article and at greater length here on our own blog. In this article, we discuss the details of Step 3 in that framework: Control & Protect (your data), and how TrustLogix can help.
Access control policies give organizations confidence that the right people are accessing the right information for the right reasons. However, implementing consistent data access control policies across multiple platforms is a time-consuming and costly task for multiple reasons:
- Each data platform has a different mechanism to define access controls.
- Data Owners and Stewards lack expertise in implementing policies across disparate data platforms.
- They therefore need to involve system owners who are more familiar with the controls native to their systems.
- There is typically no way to verify that policies have been modeled and are being enforced correctly across disparate platforms.
- This manual coordination between different teams delays access to data.
Let us go deeper and look at different scenarios in the lifecycle of data, from ingestion to consumption, and how TrustLogix helps in each of them.
Data Ingestion
Enterprises are modernizing their data ecosystems by bringing various types of data from pre-existing databases, legacy data warehouses, and data streaming services into a Data Lake. Data integration tools like AWS Glue ETL are extensively used as data pipelines to build Data Lakes.
Moving large quantities of data, including PII, PCI, and other sensitive information, through these pipelines creates numerous security challenges. Data Owners are responsible for making sure that privacy regulations and security policies are followed, which is difficult when their data is being pumped into heterogeneous cloud data platforms.
TrustLogix Secures the Data Pipeline
TrustLogix addresses these challenges with multiple capabilities that safeguard organizational data as it is piped to the cloud:
- Remove sensitive data from the payload by masking it
- Anonymize or tokenize the data
TrustLogix enables you to build security into your Data Operations process. The TrustLogix Trustlet is our serverless component that is built into the ETL process to apply security and privacy policies to your data in the data pipeline. It integrates with data classification tags and then applies security transformations to specific attributes. Only the transformed data is then pushed to the destination system.
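To make the idea of tag-driven transformations concrete, here is a minimal Python sketch of what such a step in a pipeline might look like. It is not the Trustlet's actual API: the tag names, the `TAG_ACTIONS` mapping, and the helper functions are all assumptions made for illustration, and a keyed hash stands in for a real tokenization service.

```python
import hashlib
import hmac

# Hypothetical mapping from classification tags to transformations.
# These names are illustrative, not TrustLogix configuration keys.
TAG_ACTIONS = {"PII": "tokenize", "PCI": "mask"}


def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def tokenize(value: str, key: bytes) -> str:
    """Derive a deterministic token with HMAC-SHA256 (a stand-in for a
    real tokenization service, which this sketch does not model)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def transform_record(record: dict, column_tags: dict, key: bytes) -> dict:
    """Return a copy of `record` with tagged columns transformed."""
    out = {}
    for column, value in record.items():
        action = TAG_ACTIONS.get(column_tags.get(column, ""))
        if action == "mask":
            out[column] = mask(str(value))
        elif action == "tokenize":
            out[column] = tokenize(str(value), key)
        else:
            out[column] = value  # untagged columns pass through unchanged
    return out


# Assumed classification tags and a sample record.
column_tags = {"email": "PII", "card_number": "PCI"}
record = {"id": 17, "email": "ana@example.com", "card_number": "4111111111111111"}

safe = transform_record(record, column_tags, key=b"demo-key-not-for-production")
print(safe)
```

Only `safe` would be written to the destination system in this sketch; the original `record` never leaves the pipeline step.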
Data Consumption
The data in the data lake is consumed by various other data platforms to perform data analysis, run machine learning algorithms, and provide data visualization. These tools are used by people in different roles, and the data they are permitted to access should be limited to what is appropriate under the enterprise's security and business rules. These policies should be enforced consistently across the disparate platforms used in the organization. Importantly, applying and enforcing those policies should not slow down the business: data consumers should be able to get access to the appropriate datasets without delay.
Coarse-Grained Access Control
Data platforms like Snowflake, Redshift, and Google BigQuery provide rich Role-Based Access Control (RBAC) policies. As a security best practice, customers should create a combination of data access roles with different permissions on objects and assign them as appropriate to business roles.
TrustLogix enables consistent policy creation through a single-pane Control Plane UI for defining these RBAC policies. For example, suppose the Data Engineering team requires SELECT access on the Customer object in Snowflake. Data Stewards can define this policy in one central place in TrustLogix, and we convert it into a native Snowflake RBAC policy and deploy it seamlessly into the Snowflake environment.
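As a rough illustration of what "converting a central policy into native RBAC" can mean, the Python sketch below turns one policy record into standard Snowflake `GRANT` statements. The `RbacPolicy` class, its field names, and the database, schema, and role names are hypothetical; the statements TrustLogix actually generates may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RbacPolicy:
    """Hypothetical central policy record; field names are illustrative."""
    business_role: str  # role held by the team, e.g. Data Engineering
    access_role: str    # data access role that carries the privilege
    privilege: str      # e.g. SELECT
    database: str
    schema: str
    table: str


def to_snowflake_grants(policy: RbacPolicy) -> list[str]:
    """Render the policy as Snowflake role and grant statements."""
    fq_schema = f"{policy.database}.{policy.schema}"
    fq_table = f"{fq_schema}.{policy.table}"
    return [
        f"CREATE ROLE IF NOT EXISTS {policy.access_role};",
        f"GRANT USAGE ON DATABASE {policy.database} TO ROLE {policy.access_role};",
        f"GRANT USAGE ON SCHEMA {fq_schema} TO ROLE {policy.access_role};",
        f"GRANT {policy.privilege} ON TABLE {fq_table} TO ROLE {policy.access_role};",
        # The access role is then granted to the business role.
        f"GRANT ROLE {policy.access_role} TO ROLE {policy.business_role};",
    ]


# Assumed object and role names for the Data Engineering example.
policy = RbacPolicy(
    business_role="DATA_ENGINEERING",
    access_role="CUSTOMER_READ",
    privilege="SELECT",
    database="SALES_DB",
    schema="PUBLIC",
    table="CUSTOMER",
)

statements = to_snowflake_grants(policy)
for statement in statements:
    print(statement)
```

Separating the access role (`CUSTOMER_READ`) from the business role (`DATA_ENGINEERING`) mirrors the best practice described above: privileges attach to access roles, and access roles are assigned to business roles.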
Fine-Grained Access Control
Fine-grained access control is the ability to grant access to a very specific subset of a large dataset. This is required in large enterprises when different data consumers have different job functions, and local regulations and business rules restrict their access to very specific data.
TrustLogix empowers organizations to define fine-grained access control policies that ensure only the correct people have access to the appropriate data.
Row Access Policies & Column Masking Policies
For example, in an investment bank that uses Snowflake to analyze its confidential market data, only a Market Analyst assigned to the Fintech sector can access that sector's data. To define this specific business policy, Data Stewards simply choose a single-table policy and define the data restrictions that need to be enforced.
Another typical requirement for fine-grained access control is to protect sensitive data columns from being exposed to all users. In the above example, if there is a column with proprietary data that should be masked, Data Stewards need only specify the columns that should be excluded from, or masked in, the query result.
TrustLogix converts these two business constructs into a native Snowflake row access policy and column masking policy (as shown in the picture below). TrustLogix reduces the dependency between Data Operations and Data Stewards and speeds up the process of granting access to data in a compliant manner.
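For readers unfamiliar with these two Snowflake constructs, the Python sketch below renders a simple row restriction and a column mask as Snowflake `CREATE ROW ACCESS POLICY` and `CREATE MASKING POLICY` statements. The table, column, role, and policy names are assumptions, and the policy bodies are deliberately simplistic; this is not the SQL that TrustLogix itself emits.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def row_access_policy_sql(name: str, table: str, column: str,
                          role: str, allowed_value: str) -> list[str]:
    """Rows are visible only to `role`, and only where column = allowed_value."""
    body = (
        f"CURRENT_ROLE() = {sql_literal(role)} "
        f"AND sector_value = {sql_literal(allowed_value)}"
    )
    return [
        f"CREATE OR REPLACE ROW ACCESS POLICY {name} "
        f"AS (sector_value VARCHAR) RETURNS BOOLEAN -> {body};",
        f"ALTER TABLE {table} ADD ROW ACCESS POLICY {name} ON ({column});",
    ]


def masking_policy_sql(name: str, table: str, column: str,
                       unmasked_role: str,
                       mask_text: str = "***MASKED***") -> list[str]:
    """The column is returned in clear text only to `unmasked_role`."""
    body = (
        f"CASE WHEN CURRENT_ROLE() = {sql_literal(unmasked_role)} "
        f"THEN val ELSE {sql_literal(mask_text)} END"
    )
    return [
        f"CREATE OR REPLACE MASKING POLICY {name} "
        f"AS (val STRING) RETURNS STRING -> {body};",
        f"ALTER TABLE {table} MODIFY COLUMN {column} SET MASKING POLICY {name};",
    ]


# Assumed names for the investment bank example.
row_sql = row_access_policy_sql(
    "FINTECH_ROWS", "MARKET_DATA", "SECTOR", "FINTECH_MARKET_ANALYST", "Fintech"
)
mask_sql = masking_policy_sql(
    "MASK_PROPRIETARY", "MARKET_DATA", "PROPRIETARY_SCORE", "RESEARCH_LEAD"
)

for statement in row_sql + mask_sql:
    print(statement)
```

Note that with a single hard-coded role, every other role sees no rows at all. A production policy would more likely look up the role-to-sector assignment in a mapping table so that one policy serves all analysts.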
Data Entitlement Policies
Entitlement-based access policies are a variation of fine-grained access control in which business rules defined in an external system control which sets of users get access to what data.
Continuing our banking example, brokerage advisors are assigned to a specific set of customers. This assignment is generally based on the customers’ relationship level with the firm. Customer relationships and advisor assignments are maintained in a separate system. In these cases, the cloud data platforms are not aware of this external data and so cannot use it to enforce data access policies.
TrustLogix addresses this gap for Data Stewards by providing an integration with these external objects and converting them into a native data platform policy.
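The logic of an entitlement-based policy can be sketched in a few lines of Python. Here the advisor assignments stand in for an export from the external system, and the filter shows the effect a native policy built from them would have. The data, field names, and function names are all hypothetical, and the sketch simulates the outcome rather than showing how TrustLogix performs the integration.

```python
from collections import defaultdict

# Hypothetical export from the external advisor assignment system.
assignments = [
    {"advisor": "avery", "customer_id": "C-100"},
    {"advisor": "avery", "customer_id": "C-200"},
    {"advisor": "blake", "customer_id": "C-300"},
]

# Hypothetical rows in the data platform's customer table.
customers = [
    {"customer_id": "C-100", "name": "Northwind Holdings"},
    {"customer_id": "C-200", "name": "Contoso Partners"},
    {"customer_id": "C-300", "name": "Fabrikam Trust"},
]


def build_entitlements(rows: list[dict]) -> dict[str, set[str]]:
    """Group customer ids by advisor, like an entitlement mapping table."""
    entitlements = defaultdict(set)
    for row in rows:
        entitlements[row["advisor"]].add(row["customer_id"])
    return dict(entitlements)


def visible_rows(user: str, table_rows: list[dict],
                 entitlements: dict[str, set[str]]) -> list[dict]:
    """Return only the rows the user is entitled to; unknown users get none."""
    allowed = entitlements.get(user, set())
    return [row for row in table_rows if row["customer_id"] in allowed]


entitlements = build_entitlements(assignments)
avery_view = visible_rows("avery", customers, entitlements)
print([row["customer_id"] for row in avery_view])
```

In a data platform, the same effect is typically achieved by loading the assignments into a mapping table that a row-level policy consults at query time.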
Data Tags Based Policies
Data Tags enable data owners and stewards to track sensitive data for compliance, discovery, protection, and resource usage. Creating access policies based on tags simplifies data security governance.
Finishing our banking example, there are multiple objects in the bank's data warehouse that contain proprietary data and are tagged as Confidential. Data Owners need to grant access to this data only to users with the Sensitive Data Access role. To simplify the administrative process, TrustLogix integrates with tags and data classification constructs in various data platforms such as Snowflake and Collibra. Data Owners can create one policy that grants access to Confidential data using those tags. TrustLogix iterates through the data warehouse to find all the objects tagged as Confidential and automatically creates the appropriate grants.
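The "one policy, many grants" idea can be illustrated with a short Python sketch that walks a catalog of tagged objects and emits a grant for each match. The catalog structure, object names, and role name are assumptions for the example; a real integration would read tags from the platform or the data catalog rather than from an in-memory list.

```python
# Hypothetical catalog of warehouse objects and their classification tags.
catalog = [
    {"object": "BANK_DB.RESEARCH.MARKET_DATA", "tags": {"Confidential"}},
    {"object": "BANK_DB.RESEARCH.DEAL_PIPELINE", "tags": {"Confidential", "PII"}},
    {"object": "BANK_DB.PUBLIC.HOLIDAY_CALENDAR", "tags": set()},
]


def grants_for_tag(entries: list[dict], tag: str, role: str,
                   privilege: str = "SELECT") -> list[str]:
    """Emit one GRANT statement per object that carries `tag`."""
    return [
        f"GRANT {privilege} ON TABLE {entry['object']} TO ROLE {role};"
        for entry in entries
        if tag in entry["tags"]
    ]


# One tag-based policy expands into a grant for every Confidential object.
tag_grants = grants_for_tag(catalog, tag="Confidential", role="SENSITIVE_DATA_ACCESS")
for statement in tag_grants:
    print(statement)
```

Because the grants are derived from tags, re-running the expansion after a new object is tagged as Confidential would pick it up without any change to the policy itself.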
Summary
Enterprises require a multi-pronged “defense in depth” strategy at every layer. This starts with ensuring that sensitive data leaves the source system only with proper controls in place and that access controls are consistent across the various data consumption platforms. At the same time, these controls should not add overhead for the various stakeholders. Enterprises require a solution like TrustLogix that works natively with multiple data platforms and simplifies data security governance.