AWS recently announced a preview of a new capability for AWS Glue that lets customers use natural language to create and troubleshoot data integration jobs. With Amazon Q data integration in AWS Glue, a developer describes a data integration workload and the service generates an ETL script for it.
Powered by Amazon Bedrock, the new generative AI chat experience in AWS Glue brings natural language processing to ETL, with the goal of simplifying the creation and troubleshooting of data integration jobs. Irshad Buchh, principal advisor at AWS, explains:
You describe your data integration workload, and Amazon Q generates a complete ETL script. You can troubleshoot your job by asking Amazon Q to explain the error and suggest a solution. Amazon Q provides detailed guidance throughout your data integration workflow. Amazon Q helps you learn and build data integration jobs using AWS Glue. Amazon Q helps you connect to popular AWS sources such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Amazon DynamoDB.
Introduced in preview at re:Invent, Amazon Q is a generative AI-powered assistant that helps developers and customers “solve problems, generate content, and take action.” AWS Glue, designed to simplify ETL pipeline development, is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.
The announcement highlights prompts such as the following:
Write a Glue ETL job that reads from Redshift,
drops null fields, and writes to S3 as parquet files.
Amazon Q then walks through the necessary steps and generates the Python code, as shown in the following screenshot. The code can be customized further and copied into a script editor or notebook.
Source: AWS Console
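The generated script is not reproduced here. As a rough illustration of the "drops null fields" step in the prompt above, the following is a minimal, library-free sketch. It assumes the step means removing fields that are null in every record, which is how Glue's DropNullFields transform is generally described. The function name `drop_null_fields` and the sample rows are hypothetical stand-ins for a Redshift read, not output from Amazon Q or part of the AWS Glue API.

```python
from typing import Any

Record = dict[str, Any]


def drop_null_fields(records: list[Record]) -> list[Record]:
    """Remove every field that is None (or absent) in all records.

    Assumption: this mirrors the "drop null fields" step of the prompt;
    it is an illustration, not the code that Amazon Q generates.
    """
    # Collect the names of fields that hold a non-null value at least once.
    keep = {
        key
        for rec in records
        for key, value in rec.items()
        if value is not None
    }
    # Rebuild each record with only those fields, preserving key order.
    return [{k: v for k, v in rec.items() if k in keep} for rec in records]


# Hypothetical rows standing in for the result of a Redshift table read.
rows = [
    {"order_id": 1, "customer": "acme", "coupon": None, "notes": None},
    {"order_id": 2, "customer": None, "coupon": None, "notes": "rush"},
]

cleaned = drop_null_fields(rows)
print(cleaned)
```

Only `coupon` is removed, because it is null in every row; `customer` and `notes` survive since each holds a value in at least one record. In an actual Glue job the read and write steps would go through Glue connections to Redshift and S3 rather than in-memory lists.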
Extract, transform, and load (ETL) processes have become important in recent years for managing structured and unstructured data from a variety of sources, including marketing, customer, and sensor data, and for improving business intelligence and analytics. Bala Balakumar, Director of Waka Online NZ, commented:
Amazon Q is very powerful in terms of working across disparate data sources. With AWS Glue and Amazon Q integration and Zero ETL, you can now get instant data insights.
Amazon Q data integration isn’t the only recent enhancement to AWS Glue. Glue Data Catalog now supports creation, management, and access control of multi-engine SQL views. Glue Data Quality provides anomaly detection and insights, and tracks how your data changes over time. Finally, just before re:Invent, the cloud provider added a serverless Spark UI and observability metrics to Glue.
Amazon Q data integration is available in all regions where the AI-powered assistant is currently supported. According to the documentation, no matter where a customer uses the generative AI-powered assistant, the data will be sent to and stored in an AWS Region in the US.