In this blog post I will discuss the following scenarios for connecting to databases from an AWS Lambda function: connecting to an Amazon Aurora PostgreSQL database in a private subnet with public accessibility set to No in the same AWS account, and connecting to a cross-account Amazon Redshift database in a public subnet with public accessibility set to Yes. Connect to Amazon Aurora … Continue reading Connect to AWS Aurora PostgreSQL/Amazon Redshift Database from AWS Lambda
Use redshift-data api with AWS Glue Python Shell job
AWS Glue is a fully managed extract, transform, and load (ETL) service for processing large datasets from various sources for analytics and data processing. When you add an AWS Glue job, you can choose the job type: Spark, Spark Streaming, or Python shell. For one of my use cases I … Continue reading Use redshift-data api with AWS Glue Python Shell job
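As a rough illustration of what calling the redshift-data API from a Python shell job can look like, here is a minimal sketch. It is not the code from the post: the helper name `run_statement` and its parameters are hypothetical, and it assumes a client created with `boto3.client("redshift-data")` and a cluster that allows temporary credentials for the given database user. It submits a statement, polls until it finishes, and returns only the first page of records.

```python
import time

# Statuses after which the redshift-data API stops changing a statement's state.
TERMINAL_STATES = {"FINISHED", "FAILED", "ABORTED"}


def run_statement(client, sql, cluster_id, database, db_user, poll_seconds=1.0):
    """Submit `sql` through a redshift-data client and wait for it to finish.

    `client` is assumed to be boto3.client("redshift-data"); all other
    names here are hypothetical and only for illustration.
    """
    submitted = client.execute_statement(
        ClusterIdentifier=cluster_id,
        Database=database,
        DbUser=db_user,
        Sql=sql,
    )
    statement_id = submitted["Id"]

    # The API is asynchronous, so poll describe_statement until a terminal state.
    while True:
        status = client.describe_statement(Id=statement_id)
        if status["Status"] in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)

    if status["Status"] != "FINISHED":
        raise RuntimeError(
            f"Statement {statement_id} ended as {status['Status']}: {status.get('Error')}"
        )

    # Statements such as INSERT or COPY produce no result set.
    if not status.get("HasResultSet"):
        return []

    # First page only; larger results would need to follow NextToken.
    return client.get_statement_result(Id=statement_id)["Records"]
```

Passing the client in as an argument, rather than creating it inside the function, keeps the helper easy to exercise with a stub outside of Glue.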
Using AWS Data Wrangler with AWS Glue Job 2.0 and Amazon Redshift connection
I will admit, AWS Data Wrangler has become my go-to package for developing extract, transform, and load (ETL) data pipelines and other day-to-day scripts. Its integration with multiple AWS big data services such as S3, Glue Catalog, Athena, Databases, EMR, and others makes life simple for engineers. It also provides the ability to … Continue reading Using AWS Data Wrangler with AWS Glue Job 2.0 and Amazon Redshift connection
Redshift: Convert TEXT to Timestamp
How do you convert TEXT to a timestamp in Redshift? If the score column has data in the following format, how can you display the timestamp? {"Choices":null, "timestamp":"1579650266955", "scaledScore":null} select cast(json_extract_path_text(score, 'timestamp') as timestamp) from schema.table limit 10; This SQL will fail with -- ERROR: Invalid data DETAIL: ----------------------------------------------- error: Invalid data code: 8001 context: Invalid format … Continue reading Redshift: Convert TEXT to Timestamp
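To see why the direct cast fails, it helps to look at what the value actually holds. The 13-digit string in the sample looks like a Unix epoch in milliseconds (an assumption based on its length and magnitude), not a formatted date. The Python sketch below, with a hypothetical helper name, shows the conversion that has to happen before the value becomes a timestamp.

```python
import json
from datetime import datetime, timedelta, timezone


def epoch_millis_to_datetime(text):
    """Convert a string holding epoch milliseconds to a UTC datetime."""
    # Split with integer arithmetic to avoid float rounding on the milliseconds.
    seconds, millis = divmod(int(text), 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


# Sample value of the score column, as shown in the excerpt.
score = '{"Choices":null, "timestamp":"1579650266955", "scaledScore":null}'
raw = json.loads(score)["timestamp"]

print(epoch_millis_to_datetime(raw).isoformat())
# 2020-01-21T23:44:26.955000+00:00
```

On the Redshift side, one commonly used idiom for the same arithmetic is to cast the extracted text to bigint, divide by 1000, and add the result as seconds to `timestamp 'epoch'`. That is offered here as a general pattern and is not necessarily the approach the full post describes.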