February 22, 2024

AWS Expands Data Pipelines + Creates Amazon Q Gen-AI Assistant

AI is hungry. In the current age of artificial intelligence (AI), where generative AI shows a seemingly limitless appetite for vast resources of information, the enterprise technology space never tires of talking about the importance of data and how we manage it in all its various forms.

Because data comes in such a diverse range of structures and forms, we can do a lot with it. This is a good thing, i.e. we want some data to be stored in transactional systems (retail databases being a simple example); we want some data to be in systems with fast access and low latency, because it is accessed, requested and updated regularly; we want to save money on less frequently used data by putting it in cheaper data stores; we want certain information to be highly organized, structured and de-duplicated (for example because it is related to business-critical applications on the front line); and we can also appreciate the fact that some unstructured data can be funneled into a data lake simply because we can’t categorize every voice recording, video, Internet of Things (IoT) sensor reading, or document that may not be needed today but perhaps tomorrow.

Extract, Transform and Load (ETL)

But all this variation in data topography also poses a challenge. When we need to use these information sets together – new applications in AI being an example – we face an access problem. This is where technology architects, database administrators, and software application developers talk about their ETL requirement – an acronym that denotes the need to extract, transform, and load data from one place to another.

NOTE: For the sake of data science completeness, we should also mention that ETL’s sister data integration process and discipline is Extract, Load, Transform (ELT) – the point at which we take raw or unstructured data (such as from a data lake) and transform it into an organized state for downstream use cases.
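To make the ETL/ELT distinction concrete, here is a minimal, hypothetical sketch in plain Python. The record shapes and the `transform` function are illustrative only – not any AWS API – and the point is simply where the transform step sits relative to the load step.

```python
# Toy source rows, as they might come out of a transactional store.
source_rows = [
    {"id": 1, "name": " Alice ", "amount": "19.99"},
    {"id": 2, "name": "Bob", "amount": "5.00"},
    {"id": 1, "name": " Alice ", "amount": "19.99"},  # duplicate record
]

def transform(rows):
    """Clean, type and de-duplicate records (before loading in ETL, after in ELT)."""
    seen, out = set(), []
    for r in rows:
        if r["id"] in seen:
            continue  # drop duplicates on primary key
        seen.add(r["id"])
        out.append({"id": r["id"],
                    "name": r["name"].strip(),
                    "amount": float(r["amount"])})
    return out

# ETL: extract -> transform -> load (the warehouse only ever sees clean data).
warehouse = transform(source_rows)

# ELT: extract -> load raw -> transform later (the lake keeps the raw copy).
data_lake = list(source_rows)   # raw landing zone
curated = transform(data_lake)  # organized view, produced downstream

print(warehouse == curated)  # both orderings converge on the same curated rows
```

The difference is not the cleanup logic but when it runs: ETL pays the transformation cost up front, while ELT defers it, keeping raw data available for use cases nobody has thought of yet.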

Amazon Web Services, Inc. (AWS) of course exists in a universe of databases, data lakes, data warehouses, data marketplaces and data workloads. Eager to leverage its power to create new integration capabilities across the planet’s data pipeline network, AWS has now explained how its new Amazon Aurora PostgreSQL, Amazon DynamoDB, and Amazon Relational Database Service (Amazon RDS) for MySQL integrations with Amazon Redshift will make it easier to connect and analyze transaction data from multiple relational and non-relational databases in Amazon Redshift. Customers can now also use the Amazon OpenSearch Service to perform full-text and vector search functionality on DynamoDB data in near real-time.
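The vector search capability mentioned above boils down to ranking stored items by how close their embedding vectors sit to a query vector. As a toy illustration of that core idea – pure Python with made-up catalog data, not the OpenSearch Service API, which adds the indexing, scale and DynamoDB synchronization – a nearest-neighbor lookup by cosine similarity might look like this:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical items with pre-computed embeddings. In practice these would
# come from an embedding model and live in a k-NN index, not a dict.
catalog = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail boots":   [0.8, 0.3, 0.1],
    "coffee maker":  [0.0, 0.1, 0.9],
}

def vector_search(query_vec, items, k=2):
    """Return the k item names whose embeddings are closest to the query."""
    ranked = sorted(items, key=lambda name: cosine(query_vec, items[name]),
                    reverse=True)
    return ranked[:k]

print(vector_search([1.0, 0.0, 0.0], catalog))  # footwear outranks the coffee maker
```

The appeal of the zero-ETL integration is that the embeddings and documents stay in sync with the source DynamoDB table automatically, rather than through a hand-built pipeline feeding an index like this one.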

Zero-ETL integrations

By making it easier to connect and act on all data regardless of location, AWS calls these technologies “zero-ETL integrations” and promises they will help users tap into the full depth of its database and analytics services.

“AWS offers the industry’s broadest and deepest set of data services for storing and retrieving any type of data at scale,” said Dr. Swami Sivasubramanian, vice president of data and artificial intelligence at AWS. “In addition to having the right tool for the job, customers need to be able to integrate the data spread across their organizations to unlock more value for their business. That’s why we’re investing in a zero-ETL future, where data integration is no longer a tedious, manual effort and where customers can easily get their data where they need it.”

We know that organizations have different types of data, from different origins, at different scales and speeds, and the uses of this data are just as varied. For organizations to get the most out of their data, AWS emphasizes that they need a comprehensive set of tools that take all these variables into account, along with the ability to integrate and combine data spread across multiple sources.

A working example

For example, AWS says: “A company can store transactional data in a relational database that it wants to analyze in a data warehouse, but use another analysis tool to perform a vector search on data from a non-relational database. Historically, moving data has required customers to design their own ETL pipelines, which can be challenging and expensive to build, complex to manage, and prone to periodic errors that delay access to time-sensitive insights.”

That’s why AWS underlines its work in this area: it has invested in zero-ETL capabilities that remove the burden of manually moving data. This includes federated query capabilities in Amazon Redshift and Amazon Athena – which allow users to directly query data stored in operational databases, data warehouses and data lakes – and the Amazon Connect analytics data lake – which allows users to access contact center data for analytics and machine learning. The work here also includes new zero-ETL integrations between Salesforce Data Cloud and AWS storage, data and analytics services to enable organizations to unify their data across Salesforce and AWS.
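The essence of a federated query is that the engine reaches into the source system at query time instead of copying the data out first. As a loose local analogy – SQLite rather than Athena or Redshift, with invented table names – `ATTACH` lets a single connection join tables that live in two entirely separate database files, with no intermediate copy or pipeline:

```python
import os
import sqlite3
import tempfile

# Two separate "source systems", each its own database file.
tmp = tempfile.mkdtemp()
orders_db = os.path.join(tmp, "orders.db")
customers_db = os.path.join(tmp, "customers.db")

with sqlite3.connect(orders_db) as c:
    c.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")
    c.execute("INSERT INTO orders VALUES (1, 10, 19.99), (2, 11, 5.00)")

with sqlite3.connect(customers_db) as c:
    c.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    c.execute("INSERT INTO customers VALUES (10, 'Alice'), (11, 'Bob')")

# One connection queries across both files in place: no ETL job, no copy.
conn = sqlite3.connect(orders_db)
conn.execute("ATTACH DATABASE ? AS crm", (customers_db,))
rows = conn.execute(
    "SELECT c.name, o.total FROM orders o "
    "JOIN crm.customers c ON c.id = o.customer_id"
).fetchall()
conn.close()
print(rows)  # [('Alice', 19.99), ('Bob', 5.0)]
```

The real services do this across operational databases, warehouses and lakes rather than local files, but the user-facing promise is the same: write one query, leave the data where it lives.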

Hey, remember ETL?

The whole thread of what’s happening here comes down to a theme we see playing out across the enterprise IT landscape: automation. According to G2 Krishnamoorthy, vice president of analytics at AWS, if we can eliminate much (or even all) of the ETL workload that software development and IT operations teams previously had to shoulder, then we’re putting the ETL function in a space where it becomes a utility.

G2 Krishnamoorthy says this will make not only the software engineering team happy, but also everyone who needs to access data from the huge variety of sources we’ve depicted here. Could that lead to a time where software engineers sit back and joke – hey, remember ETL? Okay, it’s not a great joke, but it’s a fun one.

Enter… Amazon Q

A new type of generative AI assistant is also currently emerging from AWS. Known as Amazon Q, this technology is built specifically for work and can be tailored to a user’s own business requirements across organizations. So (as we often say), what is it and how does it work?

AWS is positioning Q as a means to provide users of all types with a tool to get quick, relevant answers to important work (and potentially life) questions, generate content, and take action. How does it work? It draws its knowledge from the customer’s own information repositories, software application code and business systems. It is designed to streamline tasks and speed up decision making and problem solving.

Built to deliver on what AWS promises, namely enough robustness to support the rigorous demands of enterprise customers, Amazon Q can personalize its interactions for each individual user based on an organization’s existing identities, roles, and permissions. Because intellectual property (IP) concerns are always close at hand in this area, AWS says Amazon Q never uses enterprise customer content to train the underlying models. It provides gen AI-powered assistance to users who build on AWS, work in-house, and use AWS applications for business intelligence (BI), contact centers, and supply chain management.

“AWS helps customers leverage generative AI with solutions at all three layers of the stack, including purpose-built infrastructure, tools and applications,” said Dr. Swami Sivasubramanian, vice president of data and artificial intelligence. “Amazon Q builds on AWS’s history of taking complex, expensive technologies and making them accessible to customers of all sizes and technical capabilities, with a data-first approach and enterprise-grade security and privacy built in from the start. By bringing generative AI to where our customers work – whether they’re building on AWS, working with internal data and systems, or using a range of data and business applications – Amazon Q is a powerful addition to the application layer of our generative AI stack that opens new possibilities for every organization.”

AWS seems to cover a lot of bases, but that’s AWS. With so many cloud tools to choose from (some smaller companies use just a handful, but larger customers like those in the automotive industry use the entire AWS toolbox), it can be difficult to figure out which parts of the AWS stack fit each type of user base. Conveniently, Amazon Q could also help answer that question, i.e. just as the best way to combat AI-powered malware is with AI-powered vulnerability assessment and scanning tools, the complexity of the enterprise cloud can surely also be tackled with AI.

Amazon Q is available in preview to customers, while Amazon Q in Connect is generally available and Amazon Q in AWS Supply Chain is coming soon. Users should form an orderly queue… for Amazon Q.
