Using Amazon Polly

  1. You have decided to use Amazon Polly to produce your audio files. In this case, you already have one file per party (agent or customer), and the files are in mp3 format (the default for Amazon Polly). The processing pipeline, however, expects audio files as Amazon Connect produces them: a single stereo wav file with one channel for the customer and one for the agent.

  2. Because your audio files are already split by party, do not place them in the virtual folder connect in your source bucket. Instead, place them directly in the ingestion bucket, in the virtual folder recordings/Agent or recordings/Customer, respectively. Amazon Connect would further categorize them by creation date, with virtual subfolders for year, month, and day of month, but that is not required to trigger the pipeline from here.

  3. Because the pipeline expects wav files, you need to replace the code of one of the Lambda functions. The relevant function is called execute_transcription_state_machine. Go to the Lambda console and find this function in the data platform region. [Screenshot: pipeline.change-lambda.01.png] Scroll down until you see the code of the Lambda function. It is written in Python, and you can replace the code right in the browser. Copy the new function code shown below the next screenshot and paste it into the editor in place of the existing code. Don’t forget to choose the Save button in the upper right corner. [Screenshot: pipeline.change-lambda.02.png]
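
Before the replacement Lambda code, here is a minimal sketch of steps 1 and 2: synthesizing one mp3 file per party with Amazon Polly and storing it under recordings/Agent or recordings/Customer in the ingestion bucket. This is not the code to paste into the Lambda function. The helper names ingestion_key and synthesize_and_upload are hypothetical, and the bucket name, voice ID, and file name in the usage note are placeholders. The clients are passed in as arguments so the helpers can be tried without AWS access.

```python
VALID_PARTIES = ("Agent", "Customer")


def ingestion_key(party, filename):
    """Build the object key under recordings/Agent or recordings/Customer."""
    if party not in VALID_PARTIES:
        raise ValueError("party must be 'Agent' or 'Customer', got: " + repr(party))
    return "recordings/" + party + "/" + filename


def synthesize_and_upload(polly, s3, bucket, party, text, voice_id, filename):
    """Synthesize text to mp3 with Polly and store it in the ingestion bucket.

    polly and s3 are expected to behave like boto3.client("polly") and
    boto3.client("s3"); they are parameters so they can be replaced in tests.
    """
    response = polly.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    key = ingestion_key(party, filename)
    # AudioStream is a file-like object; read it fully before uploading.
    s3.put_object(Bucket=bucket, Key=key, Body=response["AudioStream"].read())
    return key
```

With real clients, a call could look like synthesize_and_upload(boto3.client("polly"), boto3.client("s3"), "example-ingestion-bucket", "Agent", "Hello, how can I help you?", "Joanna", "call-01.mp3"). The date subfolders that Amazon Connect would add are left out on purpose, since they are not required to trigger the pipeline.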

import boto3
import os
import json
client = boto3.client('stepfunctions')

def lambda_handler(event, context):
    try:
        print("Incoming event: " + str(event))
        s3_object = event["Records"][0]["s3"]
        key = s3_object["object"]["key"]
        bucket_name = s3_object["bucket"]["name"]
        transcribe_state_machine_arn = os.environ['TRANSCRIBE_STATE_MACHINE_ARN']
        region = os.environ['AWS_REGION']
        file_uri = form_key_uri(bucket_name, key, region)
        job_name = get_job_name(key)
        media_format = os.environ["MEDIA_FORMAT"]
        language_code = os.environ["LANGUAGE_CODE"]
        print("s3_object: " + str(s3_object))
        print("key: " + key)
        print("bucket_name: " + bucket_name)
        print("transcribe_state_machine_arn: " + transcribe_state_machine_arn)
        print("region: " + region)
        print("file_uri: " + file_uri)
        print("job_name: " + job_name)
        print("media_format: " + media_format)
        print("language_code: " + language_code)

        # Trying to extract media format from object key:
        print("Trying to extract media format from object key:")
        supported_audio_formats_list = ["mp3", "mp4", "wav", "flac"]
        print("Supported audio formats: " + str(supported_audio_formats_list))
        split_list = key.rsplit('.', 1)
        print("Result of attempt to split file name extension from object key: " + str(split_list))
        print("Size of list: " + str(len(split_list)))
        if len(split_list) >= 2:
            # At least the object key contained a dot.
            temporary_media_format = split_list[len(split_list) - 1]
            print("Extracted temporary media format from object key as: " + temporary_media_format)
            # Check if that is a supported format.
            if temporary_media_format in supported_audio_formats_list:
                print("Temporary media format is member of supported formats.")
                media_format = temporary_media_format
            else:
                # The extension is not a supported format: keep the environment variable setting.
                print("Falling back to media format determined by environment variable settings: " + media_format)
        else:
            # The object key has no extension: keep the environment variable setting.
            print("Falling back to media format determined by environment variable settings: " + media_format)

        execution_input = {
          "jobName": job_name,
          # "mediaFormat": os.environ["MEDIA_FORMAT"],
          "mediaFormat": media_format,
          "fileUri": file_uri,
          "languageCode": os.environ["LANGUAGE_CODE"],
          "transcriptDestination": os.environ["TRANSCRIPTS_DESTINATION"],
          "wait_time": os.environ["WAIT_TIME"]
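        }
        print("Resulting execution input: " + str(execution_input))
        # Start the state machine; the ARN and the JSON input are the values assembled above.
        response = client.start_execution(
            stateMachineArn=transcribe_state_machine_arn,
            input=json.dumps(execution_input)
        )
        return "hello"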
    except Exception as e:
        raise e

def get_job_name(key):
    key_name_without_the_file_type = key.split('.')[0]
    keys = key_name_without_the_file_type.split('/')
    # Drop everything from the first "%" on, to remove characters not allowed in a Transcribe job name.
    keys = keys[len(keys)-1].split("%")
    return keys[0]

def form_key_uri(bucket_name,key,region):
    return ""+bucket_name+"/"+key
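
To check the media format detection without deploying anything, the extension logic from the handler can be tried locally. The sketch below mirrors that logic in a standalone function; the name detect_media_format, the sample event, and the bucket name are assumptions made for illustration and are not part of the pipeline.

```python
SUPPORTED_AUDIO_FORMATS = ["mp3", "mp4", "wav", "flac"]


def detect_media_format(key, default_format):
    """Return the file extension of key if it is a supported audio format,
    otherwise fall back to default_format (the MEDIA_FORMAT setting)."""
    parts = key.rsplit(".", 1)
    if len(parts) == 2 and parts[1] in SUPPORTED_AUDIO_FORMATS:
        return parts[1]
    return default_format


# A trimmed-down S3 event with only the fields the handler reads.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-ingestion-bucket"},
                "object": {"key": "recordings/Agent/call-01.mp3"},
            }
        }
    ]
}

key = sample_event["Records"][0]["s3"]["object"]["key"]
print(detect_media_format(key, "wav"))  # prints "mp3"
```

Like the handler, the comparison is case-sensitive, so a key ending in .MP3 falls back to the default format in this sketch.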