AWS Partner Story: Pinellas County Human Services – Helping the ones in need

About Company

Pinellas County Human Services (PCHS) has been creating solutions for a stronger community by serving those most in need since 1955. With a network of over 105 partner agencies and more than 190 contracts and grants under management, Human Services helps Pinellas County residents obtain access to medical care and emergency financial assistance, connect to county judicial resources, optimize benefits for veterans and dependents, investigate consumer complaints, and find support when experiencing homelessness.

Pinellas County Human Services has partnered with the Pinellas County Department of Health and the Turley Family Health Center to provide prevention-focused health care to eligible Pinellas County residents. The Pinellas County Health Program moves clients from a “sick care” model toward a “disease management” model using medical homes.

Executive Summary

Pinellas County’s main requirement was a data ingestion solution that could collect data from more than 40 sources with different schemas and formats and align it to a common convention, so that other services and pipelines could consume it for reporting and analysis. Overall, they needed a data lake architecture for reporting and analysis.
The ingestion pipeline had to be fail-proof so that no incoming data is missed, and it had to handle data files with thousands of records. Because the pipeline deals with personally identifiable information, the data must always be encrypted in transit and at rest to remain HIPAA compliant.

Challenges

• PCHS needed a solution that would enable them to capture, analyze, and visualize data generated from a variety of sources with varying access protocols and structures, providing insights to help departments deliver better health services and medical policies.
• The pipeline had to support thousands of records uploaded every month from different hospital and medical home servers.
• The solution had to protect information with encryption in transit and at rest, and scale elastically to handle peak loads.
• The solution had to be HIPAA compliant to protect sensitive patient data.

Benefits

• Eliminates the need to provision and manage infrastructure to run each microservice.
• Lambda automatically scales up and down with load, processing millions of data points monthly.
• Cost savings.
• HIPAA-compliant solution.
• Faster insight into data.
• Speeds time to market for new customer services, since each feature is a new microservice that can run and scale independently of every other microservice.
• Decouples product engineering efforts from the platform analytics pipeline, enabling new microservices to access the data stream without being bundled with the main analytics application.

Partner Solution

Before implementing the solution, we needed to answer the following questions:
• What will be the frequency of the incoming data?
• How much data will need to be filtered at the initial stage?
• What data formats will need to be processed?
• What will be the hot key/partition key for the data when querying it for reporting and analysis?

Our Solution

Based on the above requirements, we used Lambda functions, S3, DynamoDB, and SQS, along with KMS for encryption, to keep the architecture entirely serverless. This serverless architecture reduces cost and achieves higher scalability without the burden of infrastructure management.

Lambda functions can comfortably handle the volume of data we collect from our sources each month, and Python libraries such as Pandas and NumPy help us transform the data efficiently.
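A minimal sketch of such a transformation step is shown below. The event shape follows the standard S3 notification format, but the bucket contents, column names, and clean-up rules are hypothetical placeholders rather than the project's actual schema.

    import boto3
    import pandas as pd

    s3 = boto3.client("s3")

    def handler(event, context):
        # Locate the uploaded file from the standard S3 event notification.
        record = event["Records"][0]["s3"]
        bucket = record["bucket"]["name"]
        key = record["object"]["key"]

        # Pandas can read the streaming body returned by get_object directly.
        body = s3.get_object(Bucket=bucket, Key=key)["Body"]
        df = pd.read_csv(body)

        # Typical clean-up: drop duplicates, normalize column names,
        # and coerce date columns to a common type.
        df = df.drop_duplicates()
        df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
        if "service_date" in df.columns:  # hypothetical column name
            df["service_date"] = pd.to_datetime(df["service_date"], errors="coerce")

        return {"rows_processed": len(df)}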

S3 is well suited to data lake solutions because it can hold data in almost any format, and its storage classes and lifecycle rules let us reduce the cost of storing data long term.

For metadata management we used DynamoDB. It stores the pipeline metadata that helps our Lambda functions make decisions at runtime, and it tracks the history of datasets coming from the source systems.
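The sketch below shows how a Lambda function might read and update such metadata; the table name, key, and attribute names are illustrative assumptions, not the actual schema.

    import boto3

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("pipeline_metadata")  # hypothetical table name

    def get_dataset_metadata(dataset_name):
        # Fetch the stored metadata so the Lambda can decide how to process the file.
        response = table.get_item(Key={"dataset_name": dataset_name})
        return response.get("Item", {})

    def record_run(dataset_name, status, target_location):
        # Record the outcome of the latest run and where the output landed.
        table.update_item(
            Key={"dataset_name": dataset_name},
            UpdateExpression="SET last_run_status = :s, target_location = :t",
            ExpressionAttributeValues={":s": status, ":t": target_location},
        )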

To handle pipeline failures, we used SQS as a dead-letter queue that holds failure messages and triggers notification services when the pipeline fails.

Because we handle personally identifiable information (PII), the architecture must be secured under a reliable roof, for which we used AWS GovCloud. GovCloud safeguards the sensitive data files and keeps our solution HIPAA compliant.

Pinellas County

Using an S3 bucket to collect and store the data from source systems

Every partner hospital first uploads its data files from its secured network to the designated S3 bucket using AWS Transfer Family, which is highly secure for B2B file transfers. Once a data file lands in the S3 bucket, it is automatically encrypted using SSE-KMS, which keeps the data encrypted at rest with encryption keys managed by KMS. The S3 file-upload event then triggers the Lambda function. We configured S3 bucket policies to allow connections only over HTTPS, which keeps the data encrypted in transit, and to restrict each hospital network to uploading its own files, so that no hospital can look at data that does not belong to it.
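Enforcing HTTPS-only access is commonly done with a bucket policy that denies any request where aws:SecureTransport is false. The sketch below applies such a policy with Boto3, using a placeholder bucket name; the hospital-network restrictions described above would add further conditions not shown here.

    import json
    import boto3

    s3 = boto3.client("s3")
    BUCKET = "pchs-ingestion-bucket"  # placeholder name

    deny_plain_http = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyNonTLSRequests",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
            # Any request that is not made over HTTPS is rejected.
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }],
    }

    s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(deny_plain_http))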

To increase querying efficiency in the S3 bucket, we partitioned the data by source name so that the downstream Lambda functions can easily retrieve the files.

In our pipeline we used two S3 buckets. The first is the Ingestion bucket, which stores the data coming directly from the source systems; the Lambda functions watch this bucket and fetch a data file for processing as soon as any partner hospital uploads it. The second is the Raw data bucket, which keeps the filtered data produced by the Lambda functions during the ingestion phase. The Raw bucket stores the data in Parquet format, whose columnar storage makes the data faster to query, and all datasets in the Raw bucket share the same schema. We only need to look at the past 6 months of data frequently for reporting and analysis, so with S3 lifecycle policies we migrate the data to the S3 Infrequent Access storage class after 6 months and then to S3 Glacier after 24 months, which reduces the storage cost of managing historical data.
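A lifecycle configuration matching those thresholds could be applied roughly as in the sketch below; the bucket name is a placeholder, and 180/730 days approximate the 6- and 24-month marks described above.

    import boto3

    s3 = boto3.client("s3")

    s3.put_bucket_lifecycle_configuration(
        Bucket="pchs-raw-data-bucket",  # placeholder name
        LifecycleConfiguration={
            "Rules": [{
                "ID": "archive-historical-datasets",
                "Filter": {"Prefix": ""},   # apply to the whole bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 180, "StorageClass": "STANDARD_IA"},  # ~6 months
                    {"Days": 730, "StorageClass": "GLACIER"},      # ~24 months
                ],
            }]
        },
    )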

Processing the datasets

After the file has been successfully uploaded to the Ingestion S3 bucket, our first Lambda function, “handle_s3_event_ingestion”, is triggered. It receives a payload from the invoking event, based on which it generates the metadata and invokes the next Lambda function in the pipeline. It uses Python 3.7 as its runtime environment and is configured for concurrent executions in case multiple hospitals upload data files simultaneously. Its main business logic is to build a payload and, by looking at the dataset’s source and format type, pass it to the appropriate function from the set of workflow Lambda functions in the pipeline.
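The routing logic of this first function might look roughly like the sketch below; the source-to-function mapping, key layout, and function names are assumptions for illustration.

    import json
    import boto3

    lambda_client = boto3.client("lambda")

    # Hypothetical mapping of data source/format to the workflow Lambda that handles it.
    WORKFLOW_FUNCTIONS = {
        "hospital_a/csv": "semi_structured_event_workflow_hospital_a_csv",
        "hospital_b/xlsx": "semi_structured_event_workflow_hospital_b_xlsx",
    }

    def handler(event, context):
        record = event["Records"][0]["s3"]
        bucket = record["bucket"]["name"]
        key = record["object"]["key"]

        # Keys are partitioned by source, e.g. "hospital_a/csv/2018-07-01.csv",
        # so the leading path segments identify which workflow to invoke.
        source_id = "/".join(key.split("/")[:2])
        payload = {"bucket": bucket, "key": key, "source": source_id}

        lambda_client.invoke(
            FunctionName=WORKFLOW_FUNCTIONS[source_id],
            InvocationType="Event",  # asynchronous hand-off to the workflow function
            Payload=json.dumps(payload),
        )
        return payload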

Next, we have a set of Lambda functions (“semi_structured_event_workflow_*”), of which exactly one is invoked. Each function in the set contains the business logic to filter and transform a specific source dataset in a specific data format. Filtering and transformation include operations such as removing duplicate records, normalizing data, mapping correct data types, adding or removing certain columns, and other schema changes. To use third-party libraries, we rely on our custom Lambda governance layer, which exposes predefined functions and classes wrapping specific functionality of those libraries; this use of Lambda layers gives us abstraction and an extra layer of security over the data. To read the data files from the S3 bucket we use the Boto3 library, as it communicates with S3 over HTTPS and keeps the data encrypted in transit.

Every Lambda function interacts with DynamoDB to fetch and update metadata for the data files currently being processed. DynamoDB stores information such as the dataset name, last run status, fields and their data types, timestamps, S3 source and target locations, and details of related operations that need to be performed.

After filtering the datasets, some records with missing fields or data still need to be validated, for which another Lambda function, “pii_matching”, is triggered in the pipeline. It takes the bad records and dataset name as input from the previous Lambda function and performs record-matching operations to validate them. The “pii_matching” function searches for every bad record in the Aurora database and, based on config files passed at runtime, checks for exact and proposed matches. The config files describe the fields and operators to apply when searching for a record in the database. For every exact match we update the dataset in the S3 bucket; for every proposed match we create a new entry in the database so that data scientists can validate it manually through a web application.
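The config-driven search can be pictured as building a parameterized query from the configured fields and operators. The sketch below is a simplified illustration with hypothetical field names and table, not the actual matching engine.

    def build_match_query(table, rules, record):
        # rules is a config list like [{"field": "ssn", "operator": "="},
        #                              {"field": "last_name", "operator": "ILIKE"}]
        clauses = ["{} {} %s".format(r["field"], r["operator"]) for r in rules]
        params = [record[r["field"]] for r in rules]
        sql = "SELECT client_id FROM {} WHERE {}".format(table, " AND ".join(clauses))
        return sql, params

    # Example (hypothetical): an exact-match rule set would produce
    #   ("SELECT client_id FROM clients WHERE ssn = %s AND dob = %s", [value1, value2])
    # which the Lambda then executes against the Aurora database.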

For each record we need to fire queries at the database, which increases the load and results in higher execution time. To resolve this, we used multithreading in the Lambda function, processing each record on its own thread. Multithreading reduced the Lambda execution time to a few minutes and helped reduce cost.
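A minimal sketch of this threading pattern, assuming a validate_record helper that performs the per-record database lookup:

    from concurrent.futures import ThreadPoolExecutor

    def validate_record(record):
        # Hypothetical helper: runs the exact/proposed match queries for one record.
        ...

    def validate_records(bad_records, max_workers=16):
        # Database lookups are I/O bound, so threads overlap the waiting time
        # instead of querying one record at a time.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            return list(pool.map(validate_record, bad_records))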

To manage database credentials, we used Secrets Manager, where every credential is kept encrypted with KMS, and we restricted access with appropriate IAM policies and roles.
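Retrieving those credentials at runtime is a standard Secrets Manager call, roughly as below; the secret name is a placeholder.

    import json
    import boto3

    def get_db_credentials(secret_id="pchs/aurora/credentials"):  # placeholder name
        # The secret value is decrypted with the KMS key server-side before it is returned.
        client = boto3.client("secretsmanager")
        response = client.get_secret_value(SecretId=secret_id)
        return json.loads(response["SecretString"])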

Handling Pipeline Failures

One of the challenges in the pipeline was handling Lambda execution failures, which can occur for any number of reasons at runtime. We addressed this with retries plus SQS as a dead-letter queue (DLQ). Whenever a Lambda function fails during execution, it first retries up to three times to resolve any transient dependency issues; if it still fails after the retries, a message containing all the details (dataset name, Lambda name, error statement, timestamp, and so on) is sent to the DLQ, which is configured as the function’s on-failure destination. The DLQ can later be polled by another service to send notifications.
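This retry-then-DLQ behaviour can be configured on a function’s asynchronous invocation settings, roughly as in the sketch below. The function name, queue ARN, and account details are placeholders, and note that Lambda allows at most two automatic retries, i.e. three attempts in total.

    import boto3

    lambda_client = boto3.client("lambda")

    lambda_client.put_function_event_invoke_config(
        FunctionName="semi_structured_event_workflow_hospital_a_csv",  # placeholder
        MaximumRetryAttempts=2,  # two retries after the initial attempt
        DestinationConfig={
            "OnFailure": {
                # Failed invocation records land in the dead-letter queue.
                "Destination": "arn:aws-us-gov:sqs:us-gov-west-1:123456789012:pipeline-dlq"
            }
        },
    )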

Deployment and Maintenance

For this pipeline, our deployment process is automated through GitLab pipelines, as we manage our code repositories in GitLab. For infrastructure deployment, we used Terraform scripts as infrastructure as code. We maintain two separate environments, one for development and one for production. GitLab’s automated CI/CD pipeline triggers on code changes and handles deployment.

AWS Partner Story: Spencer Stuart

About Spencer Stuart

Spencer Stuart is a global leader in executive search and board and leadership consulting, and has advised organizations around the world for more than 60 years. The firm has built a reputation for delivering real impact for its clients, from the world’s largest companies to startups to non-profits. Spencer Stuart is responsible for producing industry and/or functional analysis reports in support of search engagements, internal meetings, and new business initiatives. This includes competitive information, trends across sub-sectors, and target company list development.

Executive Summary

Spencer Stuart’s main objective was a serverless automation to detect and free up WorkSpaces that are no longer being used by their users. The requirement included looking up the WorkSpaces on a periodic basis, and the architecture had to be fault tolerant and secure. Based on the information gathered from the WorkSpaces, we had to generate reports that give a broader picture of the resources currently in use, and the solution also had to track and maintain historical records. We suggested using Lambda functions to do most of the heavy lifting in our architecture, along with other AWS services such as S3, EventBridge, SQS, and SNS. With the help of these services we achieved a serverless architecture and its benefits. Using EventBridge with cron expressions, we trigger a Lambda function that checks which WorkSpaces are currently active and which have not been active for the past 14 days. After collecting this data, the Lambda function decides which WorkSpaces to terminate. Throughout this process the Lambda function interacts with S3 to read and write lookup data. To protect the system from failed Lambda executions, we used SQS as a dead-letter queue, which lets us easily notify on and resolve Lambda failure issues with minimal downtime.

Challenge

Spencer Stuart had major security concerns regarding data privacy, so access to most of their applications was allowed only through AWS WorkSpaces. This, however, raised a growing concern about inactive WorkSpaces: not only did they add cost, but users also retained access to their WorkSpaces even after leaving the organization. The company therefore introduced a policy that all WorkSpaces inactive for more than 14 days shall be deleted. A detailed report of the deleted WorkSpaces also needed to be published on Confluence so that managers could be alerted in case of any false positives.

Benefits

  • Eliminates the risk of outsiders accessing confidential data and applications that are only reachable from WorkSpaces.
  • Automates the infrastructure and removes the need for manual monitoring.
  • Saves the cost of inactive but still-running WorkSpaces.
  • Gives detailed analytics of WorkSpace activity.

Partner Solution

Spencer Stuart’s main requirement was an automated mechanism that provisions and deletes WorkSpaces according to inactivity. The resource deletion had to run on a regular basis without failure, and continuous reporting had to be provided.

Based on the above requirements, we used AWS Lambda to query the WorkSpaces, an S3 bucket for reporting, file hosting, and data persistence, and EventBridge to run a cron job that invokes the Lambda function at regular intervals. We used an SQS dead-letter queue to maintain a solid failure-recovery mechanism and re-trigger the Lambda in case of failure, along with an SNS subscription to send notifications.

Spencer Stuart Case Study

EventBridge for Cron Jobs

An EventBridge rule was used to run on a predefined schedule. Using this feature of AWS EventBridge, you can treat it as a cron job scheduler, weekly in our case. EventBridge is a highly efficient and economical solution for projects of any scale and has very high accuracy, with a latency of just 0.5 seconds.
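A weekly schedule of this kind can be set up roughly as follows; the rule name, function name, ARNs, and exact cron expression are illustrative assumptions.

    import boto3

    events = boto3.client("events")
    lambda_client = boto3.client("lambda")

    # Run every Monday at 06:00 UTC (placeholder schedule).
    events.put_rule(
        Name="workspace-cleanup-weekly",
        ScheduleExpression="cron(0 6 ? * MON *)",
        State="ENABLED",
    )
    events.put_targets(
        Rule="workspace-cleanup-weekly",
        Targets=[{
            "Id": "workspace-cleanup-lambda",
            "Arn": "arn:aws:lambda:us-east-1:123456789012:function:workspace-cleanup",
        }],
    )
    # Grant EventBridge permission to invoke the function.
    lambda_client.add_permission(
        FunctionName="workspace-cleanup",
        StatementId="allow-weekly-cleanup-rule",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn="arn:aws:events:us-east-1:123456789012:rule/workspace-cleanup-weekly",
    )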

AWS EventBridge has the following advantages over traditional methods of scheduling cron jobs: 

  1. Jobs run on AWS Lambda, which uses cloud computation, so server performance is not affected at all. 
  2. No polling is required to check for scheduled jobs. 
  3. Scheduling jobs is very easy; AWS provides direct APIs to schedule jobs. 
  4. Long-running jobs can be executed easily without worrying about overlap.

Reporting and Termination of Workspaces

We used the Python 3.7 runtime and the AWS SDK for Python (Boto3) to describe the WorkSpaces and their activity. This returns the user, email address, and last activity of each WorkSpace.

After querying all the WorkSpaces, we filter those with more than 14 days of inactivity. We also identify the WorkSpaces that are at risk of being deleted in the next execution and notify the corresponding users by email.
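The lookup can be sketched with Boto3’s WorkSpaces client as below; the 14-day threshold follows the policy above, while the treatment of never-connected WorkSpaces is an assumption for this sketch.

    from datetime import datetime, timedelta, timezone
    import boto3

    workspaces = boto3.client("workspaces")
    INACTIVITY_LIMIT = timedelta(days=14)

    def find_inactive_workspaces():
        # Flag WorkSpaces whose last known user connection is older than the limit
        # (or that have never been connected to - an assumption for this sketch).
        cutoff = datetime.now(timezone.utc) - INACTIVITY_LIMIT
        inactive, token = [], None
        while True:
            kwargs = {"NextToken": token} if token else {}
            page = workspaces.describe_workspaces_connection_status(**kwargs)
            for status in page["WorkspacesConnectionStatus"]:
                last_seen = status.get("LastKnownUserConnectionTimestamp")
                if last_seen is None or last_seen < cutoff:
                    inactive.append(status["WorkspaceId"])
            token = page.get("NextToken")
            if not token:
                return inactive

    # WorkSpaces in the returned list can then be terminated with
    # workspaces.terminate_workspaces(TerminateWorkspaceRequests=[{"WorkspaceId": wid}, ...]).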

A detailed CSV report is also written to the S3 bucket, which is connected directly to Confluence and provides the list of deleted WorkSpaces along with the WorkSpaces that may be deleted in the next execution.

Handling Pipeline Failures

One of the challenges in the pipeline was handling Lambda execution failures, which can occur for any number of reasons at runtime. We addressed this with retries plus SQS as a dead-letter queue (DLQ). Whenever the Lambda function fails during execution, it first retries up to three times to resolve any transient dependency issues; if it still fails after the retries, a message containing all the details (error statement, timestamp, and so on) is sent to the DLQ, which is configured as the function’s on-failure destination. The DLQ is later polled by another service to send notifications with the help of SNS, which is subscribed to those SQS messages.

Deployment and Maintenance

The Python AWS Lambda functions are deployed using CodeCommit, CodeBuild, CloudFormation, and AWS SAM templates, orchestrated by CodePipeline.

The infrastructure is provisioned mainly with CloudFormation, using AWS Serverless Application Model (SAM) templates for the serverless code. CodeBuild compiles the source code from this template and outputs a new template file that is deployed as a CloudFormation stack. CodePipeline deploys the source code automatically to two environments, Dev and Prod, as soon as it is merged to the ‘main’ branch.
