1. Executive Summary¶
The Brain Berry solution is a hybrid platform that correlates information from neural sensors and user actions to raise alerts about the potential for certain illnesses. This document describes a reference solution architecture that runs on AWS services and supports the Brain Berry application platform. It covers five pillars of architectural maturity; however, data protection and privacy concerns will carry greater weight than the others when evaluating architectural approaches. For that reason, the Executive Summary highlights the security and compliance concerns of the Brain Berry solution.
Data Access Controls¶
Article 25 of the GDPR states that the controller "shall implement appropriate technical and organizational measures for ensuring that, by default, only personal data which are necessary for each specific purpose of the processing are processed." The following AWS access control mechanisms will be employed by the Brain Berry solution in order to comply with this requirement.
AWS Identity and Access Management¶
In general, AWS services take a deny-by-default approach when determining whether an instance of a service can execute an action. IAM controls will be employed throughout the solution so that each instance of a service has its own IAM role, granting only the permissions required for that service's job at that time. For example, if a service needs to write data to storage that enforces encryption (e.g. an S3 bucket), that service will be allowed to use the cryptographic key for encryption only, so it cannot read the data back once it is written to storage. IAM policy controls such as these will extend to users and third parties (e.g. external partners, medical professionals, and researchers).
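The write-only example above can be sketched as an IAM policy document built in Python. This is a minimal sketch under stated assumptions, not the solution's actual policy: the function name `write_only_policy` and the ARNs passed to it are hypothetical, and the choice of `kms:GenerateDataKey` and `kms:Encrypt` as the "encryption only" actions is our reading of the requirement.

```python
import json


def write_only_policy(bucket_arn: str, key_arn: str) -> dict:
    """Build an IAM policy that allows writing encrypted objects but not reading them."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowObjectWrite",
                "Effect": "Allow",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
            },
            {
                "Sid": "AllowEncryptOnly",
                "Effect": "Allow",
                # kms:Decrypt and s3:GetObject are deliberately absent, so the
                # role cannot read back what it has written.
                "Action": ["kms:GenerateDataKey", "kms:Encrypt"],
                "Resource": key_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical ARNs, for illustration only.
    policy = write_only_policy(
        "arn:aws:s3:::example-brainberry-ingest",
        "arn:aws:kms:eu-central-1:111122223333:key/example-key-id",
    )
    print(json.dumps(policy, indent=2))
```

One caveat to verify before relying on this pattern: S3 multipart uploads with KMS encryption also require `kms:Decrypt`, so a strictly encrypt-only role may be limited to single-part uploads.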
Multi-Factor Authentication¶
The solution will require multi-factor authentication for all users accessing production AWS resources and data. For example, we can define a policy that allows the use of a KMS key for decryption only when the aws:MultiFactorAuthPresent condition key is true.
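As an illustration, the policy described above might look like the following. This is a sketch only: the helper name `mfa_decrypt_policy` and the key ARN are hypothetical, while `kms:Decrypt` and the `aws:MultiFactorAuthPresent` condition key are standard IAM elements.

```python
import json


def mfa_decrypt_policy(key_arn: str) -> dict:
    """Build an IAM policy that allows KMS decryption only for MFA-authenticated sessions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowDecryptWithMfa",
                "Effect": "Allow",
                "Action": "kms:Decrypt",
                "Resource": key_arn,
                # The statement matches only when the session was established with MFA.
                "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}},
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical key ARN, for illustration only.
    print(json.dumps(
        mfa_decrypt_policy("arn:aws:kms:eu-central-1:111122223333:key/example-key-id"),
        indent=2,
    ))
```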
Defining Boundaries for Regional Services Access¶
The solution allows us to choose the AWS Regions in which data is stored, so we can deploy AWS services in the locations of our choice in accordance with specific geographic requirements. For example, if we want to ensure our content is located only in Europe, we can choose to activate and deploy AWS services exclusively in the European AWS Regions (Frankfurt, London, Paris, Ireland, Milan, or Stockholm). We can further enforce these regional access controls while still allowing for a global solution by defining separate AWS accounts, each with access to specific Regions only and with no automatic ability to share data between accounts.
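One way to enforce such a boundary is a service control policy (SCP) attached to an account, denying requests outside an allowed list of Regions through the `aws:RequestedRegion` condition key. The sketch below is an assumption about how this could be done, not a confirmed part of the design; the list of exempted global services is illustrative and would need review.

```python
import json

# Region codes for Frankfurt, London, Paris, Ireland, Milan, and Stockholm.
EU_REGIONS = [
    "eu-central-1",
    "eu-west-2",
    "eu-west-3",
    "eu-west-1",
    "eu-south-1",
    "eu-north-1",
]

# Illustrative exemptions: global services are not tied to one of the Regions above.
GLOBAL_SERVICE_ACTIONS = ["iam:*", "organizations:*", "sts:*", "route53:*", "cloudfront:*"]


def region_restriction_scp(allowed_regions: list, exempt_actions: list) -> dict:
    """Build an SCP that denies any non-exempt action requested outside the allowed Regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                "NotAction": exempt_actions,
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": allowed_regions}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(region_restriction_scp(EU_REGIONS, GLOBAL_SERVICE_ACTIONS), indent=2))
```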
Control Access to Web Applications and Mobile Apps¶
User login and access control for the Brain Berry web applications, mobile apps, and analytic tools will use Amazon Cognito. Amazon Cognito user pools provide a secure user directory that scales to hundreds of millions of users. To protect the identity of the users, we can add multi-factor authentication (MFA) to our user pools with no impact on the rest of the solution architecture.
With Amazon Cognito Identity Pools (Federated Identities), we can see who accessed each resource and where the access originated (mobile app or web application). For example, calls to Lambda functions that provide READ access to data can be logged to maintain a record of "who" accessed "what" and "when".
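The access record described above could be produced by a Lambda function along the following lines. This is a sketch under assumptions: it presumes the function is invoked through an API Gateway proxy integration with IAM authorization, where the Cognito identity appears under `requestContext.identity`, and the resource name `patient-readings` is hypothetical.

```python
import json
from datetime import datetime, timezone


def build_audit_record(event: dict, resource: str, now=None) -> dict:
    """Extract who accessed what, when, and from where out of an invocation event."""
    identity = event.get("requestContext", {}).get("identity", {})
    now = now or datetime.now(timezone.utc)
    return {
        "who": identity.get("cognitoIdentityId", "unknown"),
        "what": resource,
        "when": now.isoformat(),
        "source_ip": identity.get("sourceIp"),
        # The user agent helps distinguish the mobile app from the web application.
        "user_agent": identity.get("userAgent"),
    }


def handler(event, context):
    """Lambda entry point: log the READ access, then return a response."""
    record = build_audit_record(event, resource="patient-readings")
    # Output written to stdout by a Lambda function is delivered to CloudWatch Logs.
    print(json.dumps(record))
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```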
Protecting Data¶
Encrypting Data at Rest¶
Data will be stored encrypted at rest and can be decrypted only by a party (user, third party, or AWS service) with authorized access to the relevant customer managed key in AWS Key Management Service (KMS). This applies to data in S3, EBS, EFS, and Amazon database services. KMS will allow us to create, import, and rotate keys, as well as define usage policies and audit usage from the AWS Management Console or by using the AWS SDK or AWS CLI. For any customer managed key in KMS, we can control who has access to the key and which services it can be used with through a number of access controls, including grants and condition keys in key policies or IAM policies.
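To illustrate restricting which service a key can be used with, a key policy statement can use the `kms:ViaService` condition key. The sketch below assumes a hypothetical role ARN and limits key use to requests made through S3 in one Region; it is one possible statement, not the full key policy.

```python
import json


def s3_only_key_statement(role_arn: str, region: str) -> dict:
    """Build a KMS key policy statement allowing key use only via S3 in the given Region."""
    return {
        "Sid": "AllowUseViaS3Only",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": ["kms:GenerateDataKey", "kms:Decrypt"],
        # In a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
        "Condition": {
            "StringEquals": {"kms:ViaService": f"s3.{region}.amazonaws.com"}
        },
    }


if __name__ == "__main__":
    # Hypothetical role ARN, for illustration only.
    statement = s3_only_key_statement(
        "arn:aws:iam::111122223333:role/example-analytics-role", "eu-central-1"
    )
    print(json.dumps(statement, indent=2))
```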
Encrypt Data in Transit¶
All data in transit will be encrypted with TLS, both over HTTPS and over MQTT (for the IoT components).
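For S3, one common way to enforce encryption in transit is a bucket policy that denies any request not made over TLS, using the `aws:SecureTransport` condition key. The following is a minimal sketch with a hypothetical bucket ARN; it covers only the S3 part of the requirement, not the MQTT connections.

```python
import json


def deny_insecure_transport(bucket_arn: str) -> dict:
    """Build a bucket policy that rejects every S3 request made without TLS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_insecure_transport("arn:aws:s3:::example-brainberry-data"), indent=2))
```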
Monitoring and Logging¶
Article 30 of the GDPR states that "...each controller and, where applicable, the controller's representative, shall maintain a record of processing activities under its responsibility". This article also includes details about which information must be recorded when you monitor the processing of all personal data. Controllers and processors are also required to send breach notifications in a timely manner, so detecting incidents quickly is important. To help comply with these obligations, the solution will leverage the following monitoring and logging services.
Manage and Configure Assets with AWS Config¶
AWS Config provides a detailed view of the configuration of many types of AWS resources in your AWS account. This includes how the resources are related to one another, and how they were previously configured, so you can see how the configurations and relationships change over time. AWS Config will be enabled on all production accounts, and will provide the following capabilities:
- Evaluate AWS resource configurations to verify the settings are correct.
- Get a snapshot of the current configurations of the supported resources.
- Get current/historical configurations of one or more resources.
- Get a notification when a resource is created, modified, or deleted, and optionally remediate automatically when a resource is modified, e.g. when encryption enforcement is turned off on an S3 bucket.
- See relationships between resources. For example, find all resources that use a particular security group.
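The evaluation capability in the list above can be sketched as the core logic of a custom AWS Config rule. This is an assumption-heavy illustration: it presumes that the configuration item for an S3 bucket carries its encryption settings under `supplementaryConfiguration.ServerSideEncryptionConfiguration`, which should be confirmed against a real configuration item before use. The return values are the standard Config compliance types.

```python
def evaluate_bucket_encryption(configuration_item: dict) -> str:
    """Return a Config compliance type for an S3 bucket's encryption setting."""
    if configuration_item.get("resourceType") != "AWS::S3::Bucket":
        return "NOT_APPLICABLE"
    # Assumed location of the bucket's default-encryption settings.
    supplementary = configuration_item.get("supplementaryConfiguration", {})
    if supplementary.get("ServerSideEncryptionConfiguration"):
        return "COMPLIANT"
    return "NON_COMPLIANT"


if __name__ == "__main__":
    # Hypothetical, heavily trimmed configuration item.
    item = {"resourceType": "AWS::S3::Bucket", "supplementaryConfiguration": {}}
    print(evaluate_bucket_encryption(item))
```

A NON_COMPLIANT result is the kind of event that could trigger the notification or automatic remediation mentioned in the list.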
Compliance Auditing and Security Analytics¶
With AWS CloudTrail enabled, we will continuously monitor AWS account activity. A history of AWS API calls will be captured, including API calls made through the AWS Management Console, the AWS SDKs, the command line tools, and higher-level AWS services. We can identify which users and accounts called AWS APIs for services that support CloudTrail, the source IP address the calls were made from, and when the calls occurred. CloudTrail will also be used by other services, such as IAM, to help with ongoing recommendations about security policies that may be too permissive, based on how users and services have actually used those policies over time. For example, a user might have a policy that grants write access to S3 but no history of ever using that permission.
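The idea behind the unused-permission example can be illustrated with a simple comparison between granted actions and the actions seen in CloudTrail records. This is a simplified sketch, not how IAM computes its recommendations: it assumes each record has the standard `eventSource` and `eventName` fields, and it assumes the service prefix of `eventSource` matches the IAM action prefix, which does not hold for every service.

```python
def used_actions(cloudtrail_events: list) -> set:
    """Derive 'service:Action' strings from CloudTrail event records."""
    used = set()
    for event in cloudtrail_events:
        # e.g. "s3.amazonaws.com" -> "s3" (assumed to equal the IAM prefix)
        service = event["eventSource"].split(".")[0]
        used.add(f"{service}:{event['eventName']}")
    return used


def unused_permissions(granted: list, cloudtrail_events: list) -> set:
    """Return granted actions that never appear in the supplied event history."""
    return set(granted) - used_actions(cloudtrail_events)


if __name__ == "__main__":
    history = [{"eventSource": "s3.amazonaws.com", "eventName": "GetObject"}]
    print(unused_permissions(["s3:GetObject", "s3:PutObject"], history))
```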
Collecting and Processing Logs¶
CloudWatch Logs can be used to monitor, store, and access log files from AWS Lambda, IoT Core, AWS CloudTrail, Route 53, and other sources. Log information includes, for example:
- Granular logging of access to Amazon S3 objects
- Detailed information about flows in the network through VPC Flow Logs
- Rule-based configuration verification and actions with AWS Config rules
- Filtering and monitoring of HTTP access to applications with web application firewall (WAF) functions in CloudFront
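As one illustration of processing the log sources listed above, the sketch below parses VPC Flow Logs records in the default (version 2) format and picks out rejected flows. It assumes the default field order; custom flow log formats would need a different field list. The function names are hypothetical.

```python
# Field order of the default (version 2) VPC Flow Logs record format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_flow_log(line: str) -> dict:
    """Split one space-separated flow log record into named fields."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def rejected_flows(lines: list) -> list:
    """Return the parsed records whose action is REJECT."""
    return [rec for rec in map(parse_flow_log, lines) if rec["action"] == "REJECT"]


if __name__ == "__main__":
    sample = (
        "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
        "20641 22 6 20 4249 1418530010 1418530070 REJECT OK"
    )
    for record in rejected_flows([sample]):
        print(record["srcaddr"], "->", record["dstaddr"], "port", record["dstport"])
```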
Discovering and Protecting Data at Scale with Amazon Macie¶
Article 32 of the GDPR states that:
- the controller and the processor shall implement appropriate technical and organizational measures to ensure a level of security appropriate to the risk, including inter alia as appropriate:
- the ability to ensure the ongoing confidentiality, integrity, availability, and resilience of processing systems and services;
- a process for regularly testing, assessing, and evaluating the effectiveness of technical and organizational measures for ensuring the security of the processing.
To help identify and protect sensitive data, we will leverage Amazon Macie, a fully managed data security and data privacy service that uses pattern matching and machine learning models to detect Personally Identifiable Information (PII) stored in S3 buckets. Amazon Macie scans S3 buckets and categorizes their data using managed data identifiers that are designed to detect several categories of sensitive data. Macie can detect PII such as full name, email address, birth date, national identification number, taxpayer identification or reference number, and more. We can also define custom data identifiers that reflect the organization's particular scenarios (for example, customer account numbers or internal data classification).
Amazon Macie continually evaluates the objects inside the buckets and automatically provides a summary of findings for any unencrypted or publicly accessible data that matches a defined data category. These findings can include alerts for unencrypted or publicly accessible objects, or for buckets shared with AWS accounts outside those defined in AWS Organizations. Amazon Macie integrates with other AWS services, such as AWS Security Hub, to generate actionable security findings and to trigger an automated response to a finding.
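Custom data identifiers in Macie are built around a regular expression. The sketch below shows the kind of pattern that might be tested locally before registering it; the account number format (`BB-` followed by eight digits) is entirely hypothetical, and Macie supports only a subset of regex syntax, so a pattern that works in Python should still be checked against Macie's rules.

```python
import re

# Hypothetical Brain Berry customer account number format: "BB-" plus eight digits.
ACCOUNT_NUMBER = re.compile(r"\bBB-\d{8}\b")


def find_account_numbers(text: str) -> list:
    """Return every substring that looks like a customer account number."""
    return ACCOUNT_NUMBER.findall(text)


if __name__ == "__main__":
    sample = "Session exported for BB-12345678; ticket BB-1234 is unrelated."
    print(find_account_numbers(sample))
```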
Centralized Security Management¶
- AWS Control Tower provides a method to set up and govern a new, secure, multi-account AWS environment. It automates the setup of a landing zone, which is a multi-account environment that is based on best-practices blueprints, and enables governance using guardrails that you can choose from a pre-packaged list.
- AWS Security Hub centralizes and prioritizes security and compliance findings from across AWS accounts and services, such as Amazon GuardDuty and Amazon Inspector, and can be integrated with security software from third-party partners to help you analyze security trends and identify the highest priority security issues.
- Amazon GuardDuty is an intelligent threat detection service that can help customers more accurately and easily monitor and protect their AWS accounts, workloads, and data stored in Amazon S3. GuardDuty analyzes billions of events across your AWS accounts from several sources, including AWS CloudTrail Management Events, CloudTrail Amazon S3 Data Events, Amazon Virtual Private Cloud Flow Logs, and DNS logs. For example, it detects unusual API calls, suspicious outbound communications to known malicious IP addresses, or possible data theft using DNS queries as the transport mechanism. GuardDuty is able to provide more accurate findings by leveraging machine learning-powered threat intelligence and third-party security partners.
Miscellaneous Concerns¶
While the above points describe the services used to protect and audit customer data within the Brain Berry architecture, it will still be possible to implement processes to share information between regions, accounts, or with 3rd parties. Two examples will help illustrate how we can meet our security, privacy, and functional concerns.
- Development Environment Data Refresh - Developers and testers need access to realistic, well-formed data, but they do not need access to the real production values. Data obfuscation can remove private information so that the data is usable in a less secure environment. The proposed architecture can meet this requirement in the following fashion:
- Use AWS Organizations and AWS Control Tower to establish separate accounts for production and development purposes, i.e. most developers will only have access to the development account, which contains no production information.
- In the production account, establish an AWS Database Migration Service (DMS) task that has permission to:
- Read production data into a staging area
- Obfuscate the data in the staging area
- Use the obfuscated data to reload a development database
- Delete the obfuscated data.
- Establish an IAM policy that allows a user to execute the DMS task only if they have logged into the production account with MFA.
In this scenario, the user doesn't need permission to access the data; they only need permission to execute the DMS task. While the task itself has permission to decrypt sensitive data, that data will never be visible to actual users until it has been fully obfuscated.
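The obfuscation step in the refresh process could work along these lines. This is a sketch of one possible technique (redacting direct identifiers and replacing record keys with a keyed hash), not the method the architecture prescribes; the field names and the secret are hypothetical, and in practice the secret would be held only in the production account.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret: bytes) -> str:
    """Replace a value with a keyed hash so joins still work but the original is hidden."""
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def obfuscate_record(record: dict, secret: bytes,
                     pii_fields=("name", "email"),
                     id_fields=("patient_id",)) -> dict:
    """Return a copy of the record with PII redacted and identifiers pseudonymized."""
    out = dict(record)  # leave the staged original untouched
    for field in pii_fields:
        if field in out:
            out[field] = "REDACTED"
    for field in id_fields:
        if field in out:
            out[field] = pseudonymize(str(out[field]), secret)
    return out


if __name__ == "__main__":
    # Hypothetical production row and secret, for illustration only.
    row = {"patient_id": "P-001", "name": "Jane Doe",
           "email": "jane@example.com", "heart_rate": 72}
    print(obfuscate_record(row, secret=b"example-secret"))
```

Because the hash is keyed and deterministic, the same patient maps to the same pseudonym across tables, which keeps the development data relationally consistent.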
- EU Researchers Need Access to Patient Data - IoT Analytics is one tool that allows data to be modified, filtered, and delivered to multiple endpoints. The proposed architecture can meet this requirement in the following fashion:
- Researchers can be given their own AWS account under Brain Berry's AWS Organizations, or they can establish their own AWS account. An account under AWS Organizations can be used for data access only, or the researchers can also use other AWS services in it for data processing.
- IoT Analytics can include rules to filter the data sent to the researchers (removing rows or columns, or applying real-time obfuscation).
- Customer managed keys in AWS KMS can provide fine-grained policy control such that IoT Analytics can only encrypt data destined for the researchers and cannot access other data created by the researchers.
- Delivery of PII data can be logged for granular monitoring of "who" has received "what" data.
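The row and column filtering described in the list above can be pictured with a small, service-independent sketch. It does not use the IoT Analytics API; it only shows the filtering logic such a rule would apply, and the column names and the consent flag are hypothetical.

```python
def filter_for_researchers(rows: list, allowed_columns: set, keep_row) -> list:
    """Drop rows that fail the predicate and strip every column not explicitly allowed."""
    return [
        {key: value for key, value in row.items() if key in allowed_columns}
        for row in rows
        if keep_row(row)
    ]


if __name__ == "__main__":
    readings = [
        {"patient_id": "a1", "consented": True, "signal": 0.4, "email": "x@example.com"},
        {"patient_id": "b2", "consented": False, "signal": 0.9, "email": "y@example.com"},
    ]
    shared = filter_for_researchers(
        readings,
        allowed_columns={"patient_id", "signal"},
        keep_row=lambda row: row["consented"],
    )
    print(shared)
```

Using an allow-list of columns rather than a deny-list means a newly added field is withheld from researchers by default.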
If GDPR compliance cannot be maintained with third parties, the architecture can support monitored access to Brain Berry's account, similar to the mechanism described in the Development Environment Data Refresh section.