Artificial Intelligence (AI) Services
Introduction (Lesson 173)
| Service | Description |
|---|---|
| Transcribe | Converts speech to text. |
| Translate | Translates text between languages. |
| Comprehend | Analyzes text to discern intent and context. |
| . | Sentiment analysis, Custom text classification, Document grouping by topics, Extracting medical information. |
| Polly | Text-to-speech service. |
| Lex | Powers conversational interfaces like chatbots. |
| . | Allows both voice and text-based communication. |
| . | Underlies Amazon's Alexa consumer products. |
| Rekognition | Analyzes image and video content. |
| . | Identifies objects, people, facial expressions, and inappropriate content; tracks people across frames. |
| Textract | Extracts text and data from documents. |
| . | Works with scanned images, PDFs, and more. |
| DeepLens | A deep learning-enabled video camera for developers. |
| . | A tool for building and testing vision-enabled applications. |
Amazon Transcribe (Lesson 174&175)
Steps to Transcribe Audio
- Login to AWS Management Console, look for Transcribe service.
- Real-Time Transcription Option: Allows for instant transcription via computer mic.
- Transcription Jobs: Allows batch transcription, utilizes stored media in S3 for transcription.
- Upload the sample audio file in an S3 bucket.
- Create Transcription Job:
- Name the job (e.g., "xgboost sample").
- Input file location: S3 path of the audio file.
- Format: WAV.
- Output data: Amazon default.
- Review Transcribed Text: Analyze the generated text, including confidence scores and timestamps.
Initial Results
- Transcription Quality: The default transcription significantly altered the content's meaning.
- Concerns: Highlighted issues with accent interpretation by the service.
- Request and Response Example: Demonstrates application integration for transcription.
Addressing missed words and phrases in the transcription process.
- Vocabulary List: Text file with words for Transcribe to detect.
- Formatting:
- Hyphens for phrases (e.g., "Los-Angeles").
- Dots for acronyms (e.g., "F.B.I.").
- Example: Adding "X.G.-Boost" for "XGBoost".
- Upload Vocabulary Files to S3: Place
VocabularyList.txtin an S3 bucket. - Create Custom Vocabulary in Transcribe Console:
- Name the vocabulary (e.g., "xgboost-list").
- Select the corresponding file from S3.
- Transcription Job with Custom Vocabulary:
- Create a copy of the original transcription job.
- Select the newly created custom vocabulary.
- Analyze the improved output.
- Vocabulary Table: Provides additional pronunciation and display information.
- Columns:
- Phrase: Word/phrase to recognize.
- IPA: International Phonetic Alphabet notation.
- Sounds Like: Breakdown of word pronunciation.
- Display As: Desired output format.
- Usage: Specify either IPA or Sounds Like, not both.
Amazon Translate (Lesson 176)
- Supports real-time and batch translation.
- From AWS management console, navigate to Amazon Translate.
- Real-Time Translation:
- Process: Input source text and select target language for translation.
- Auto-Detect: Feature to automatically identify the source language.
Amazon Comprehend
- Analyzes text data to extract insights.
- Capabilities: Sentiment analysis, custom classification, medical information recognition, syntax analysis.
Lab-1 (Lesson 178)
- Accessing Comprehend: Navigate to the Comprehend service in AWS Management Console.
- Real-Time Analysis: Use the console for immediate text analysis.
- Sample Text Analysis:
- Example Text: Analysis of XGBoost and Kaggle-related text.
- Detected Entities: Categorizes "XGBoost" as an organization with varying confidence scores.
- Sentiment: Neutral with slight positive bias.
Amazon Comprehend Capabilities Overview
- Entity Recognition: Identifies entities like organizations, cities, dates, and provides confidence scores.
- Key Phrase Detection: Extracts important phrases or topics from text.
- Language Detection: Automatically identifies the language of the text.
- Sentiment Analysis: Determines the overall sentiment (positive, negative, neutral, mixed).
- Syntax Analysis: Breaks down text into parts of speech.
Medical Text Analysis
- Utility: Extracts meaningful information from unstructured health records.
- Example: Analysis of a simple medical text.
- Results: Categorization into diagnosis, medicine, dosage, frequency, etc.
- Note: Possible permission errors, retry or contact AWS support if persistent.
Pricing Comprehend (Lesson 179)
- Custom Comprehend APIs: Used for training custom NLP models for text categorization and entity extraction.
- Asynchronous Inference Requests:
- Charged per 100 characters.
- Minimum charge: 3 units (300 characters) per request.
- Model Training Charges:
- $3 per hour, billed by the second.
- Model Management Fee:
- $0.50 per month per custom model.
- Synchronous Inference Requests:
- Requires provisioning an endpoint with specified throughput.
- Charges accrue from endpoint start time to deletion.
- Pricing details are available on the Amazon Comprehend Pricing Page.
Lab-2 (Lesson 180&181)
- Create a classifier to identify tweets requiring follow-up using Amazon Comprehend.
Data Preparation Steps
- Use SageMaker notebook instance.
- Attach
AmazonS3ReadOnlyAccesspolicy for S3 access. - Retrieve dataset from the specified S3 bucket.
- Analyze the dataset comprising 10,000 tweets with 45 columns.
text(tweet content) andtraining label(follow-up required or not).
File Format for Comprehend
- CSV Format: Label in the first column, followed by the tweet text.
- No Header: Ensure no headers are included in the file.
- File Creation: Two versions of the test file - with and without labels.
Building a Custom Classifier
- Objective: Classify Twitter tweets requiring follow-up using Amazon Comprehend.
- Data: AWS-provided Twitter dataset with regular and follow-up tweets.
Steps to Build the Classifier
- Access Comprehend Management Console: Ensure the same AWS region as the S3 bucket.
- Custom Classifier Creation:
- Name:
Twitter follow up. - Language: English.
- Training Data: CSV file from S3 containing tweets and labels.
- Role Permission: Create an IAM role for Comprehend to access S3 files.
Batch Prediction
- Name:
Twitter test. - Analysis Type: Custom Classification.
- Classifier:
Twitter follow up. - Test Data: Specified from S3.
Amazon Polly (Lesson 182)
Introduction to Amazon Polly
- Purpose: Converts text into lifelike speech.
- Applications: Building speech-enabled products and applications.
- Capabilities:
- Real-time and batch audio stream generation.
- Variety of voices and languages.
- Customizable speech synthesis using SSML (Speech Synthesis Markup Language).
Hands-On Lab with Amazon Polly
- Accessing Polly: Navigate to Polly in the AWS Management Console.
- Text Input: Use a standard example text from previous lectures.
- Voice Selection: Experiment with different voices and accents.
- Engine Types:
- Standard: Basic text-to-speech conversion.
- Neural: Enhanced quality for more lifelike speech.
Customization and Quality Improvement
- SSML Markup: Customize aspects like speaking style, pronunciation, and speed.
- Quality Enhancement: Necessary to refine speech for a more natural flow.
Amazon Lex (Lesson 183)
Introduction to Amazon Lex
- Purpose: Build conversational interfaces using voice and text.
- Integration: Combines speech recognition, natural language processing, and business logic.
- Usage: Powers Amazon Alexa and other consumer products.
Key Features of Lex
- Speech Recognition: Converts speech to text.
- Natural Language Processing (NLP): Understands user intent.
- Lambda Integration: Triggers relevant business logic.
- Communication: Responds with voice or text.
Understanding Lex with Hotel Booking Example
- Utterance: User message expressing interest in booking.
- Intent: Lex invokes an intent (e.g., book a hotel).
- Slots: Collects additional information (e.g., city, dates).
- Fulfillment: Completes the booking with all required data.
Hands-On Lab with Amazon Lex
- Access Lex Console: Navigate to Lex in AWS Management Console.
- Create a Bot: Example bot for booking trips.
- Utterances and Intents:
- Car Booking: Recognizes specific phrases for booking cars.
- Hotel Booking: Recognizes phrases for booking hotels.
- Data Collection: Defines slots for required information (city, dates, car type, etc.).
- DataType Specification: Sets data types for each slot (e.g., city, date).
- Confirmation Prompt: User can confirm or cancel the booking.
- Lambda Integration: Optional for business logic execution.
Amazon Rekognition (Lesson 184)
Introduction to Amazon Rekognition
- Purpose: Analyzes images and videos to identify objects, scenes, activities, and more.
- Applications: Object detection, facial recognition, content moderation, celebrity identification, text extraction.
Hands-On Lab with Rekognition
- Access Rekognition: Navigate to the Rekognition service in AWS Management Console.
- Object and Scene Detection: Analyze images to identify objects and describe scenes.
Image Analysis Examples
- Automobiles and Sports: Detected various vehicles, person, skateboard, and associated the scene with sports.
- City Skyline: Identified urban environment, city, highrise.
- Warplanes and Bombers: Recognized aircraft, misinterpreted as warplanes.
- Cat: Precisely identified breed as Abyssinian cat.
Features
- Image Moderation: Safe Image. No moderation labels detected.
- Facial Analysis: Identifies attributes like gender, age range, and expression for each person in a photo.
- Celebrity Recognition: Quick and accurate identification of celebrities.
- Face Comparison: Compares faces for similarity in different images.
- Text Detection in Images: Extracts text from images, identifying phrases, fonts, and license plates.
- Video Analysis: Detects people, objects, activities
Conclusion
- Caution: AI conclusions can be unexpected, stressing the need for human supervision.
Amazon Textract (Lesson 185)
- Extract text, forms, and data from scanned documents.
- Use Cases:
- Enable keyword search in scanned documents.
- Detect personally identifiable information (PII) for compliance and security.
Hands-On Example with Textract
- Access Textract: Navigate to Textract in the AWS Management Console.
- Analyzing a Document:
- Sample Form: Extracts raw text, entry fields, and table data.
- Process: Upload document for analysis by Textract.
Document Analysis Examples
- AWS Machine Learning Specialty Exam Guide (PDF):
- Content: Sections, tables, and varied text.
- Textract Analysis: Run Textract on the PDF document.
- Outcome: Partially successful extraction of text and tables.
- Results and Downloads
- Raw Text: Downloadable as a text file containing the content of the PDF.
- Table Data: Extracted and available in CSV format.
- Potential Applications: Feeding data into Elasticsearch or other search tools for document searchability.