Use redshift-data api with AWS Glue Python Shell job

AWS Glue is a fully managed extract, transform, and load (ETL) service for processing large volumes of data from various sources for analytics. When you add an AWS Glue job, you can choose its type: Spark, Spark Streaming, or Python shell. For one of my use cases I …
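The excerpt stops before the post's own code, so the following is only a minimal sketch of how a Python shell job might submit a statement through the Redshift Data API and wait for it to finish. It is not necessarily the author's approach. The helper name `run_statement` and its parameters are hypothetical; `execute_statement`, `describe_statement`, and `get_statement_result` are the boto3 `redshift-data` client calls it assumes. The client is passed in so the sketch does not depend on credentials or a particular boto3 version.

```python
import time


def run_statement(client, sql, cluster_id, database, db_user, poll_seconds=1.0):
    """Submit SQL via the Redshift Data API and block until it completes.

    `client` is assumed to behave like boto3.client("redshift-data").
    Returns the first page of result records, or [] if there is no result set.
    """
    response = client.execute_statement(
        ClusterIdentifier=cluster_id,
        Database=database,
        DbUser=db_user,
        Sql=sql,
    )
    statement_id = response["Id"]

    # The Data API is asynchronous, so poll until a terminal status is reached.
    while True:
        desc = client.describe_statement(Id=statement_id)
        status = desc["Status"]
        if status == "FINISHED":
            break
        if status in ("FAILED", "ABORTED"):
            raise RuntimeError(f"Statement {statement_id} {status}: {desc.get('Error')}")
        time.sleep(poll_seconds)

    if not desc.get("HasResultSet"):
        return []
    # Only the first page is fetched here; larger results need NextToken paging.
    return client.get_statement_result(Id=statement_id)["Records"]
```

Inside a Glue job this would be called with something like `run_statement(boto3.client("redshift-data"), "select 1", "my-cluster", "dev", "awsuser")`, where the cluster, database, and user names are placeholders. The job's IAM role would also need permission to call the Data API.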

Using AWS Data Wrangler with AWS Glue Job 2.0 and Amazon Redshift connection

I will admit, AWS Data Wrangler has become my go-to package for developing extract, transform, and load (ETL) data pipelines and other day-to-day scripts. Its integration with multiple AWS big data services such as S3, Glue Catalog, Athena, databases, EMR, and others makes life simple for engineers. It also provides the ability to …

Redshift: Convert TEXT to Timestamp

How do you convert TEXT to a timestamp in Redshift? If the score column holds data in the following format, how can you display the timestamp?

    {"Choices":null, "timestamp":"1579650266955", "scaledScore":null}

    select cast(json_extract_path_text(score, 'timestamp') as timestamp) from schema.table limit 10;

This SQL will fail with:

    ERROR: Invalid data
    DETAIL:
    -----------------------------------------------
    error: Invalid data
    code: 8001
    context: Invalid format …
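The sample value "1579650266955" looks like a Unix epoch in milliseconds rather than a formatted date string, which would explain why a direct cast to timestamp is rejected. The excerpt is cut off before the post's own fix, so what follows is only a sketch under that assumption, not necessarily the author's solution. The function name `epoch_ms_text_to_timestamp` is hypothetical, and the SQL in `REDSHIFT_SQL` uses the commonly cited `timestamp 'epoch' + seconds * interval '1 second'` idiom and has not been run against a cluster here.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumed Redshift-side equivalent (untested sketch): cast the text to bigint,
# reduce milliseconds to whole seconds, and add that many seconds to the epoch.
REDSHIFT_SQL = """
select timestamp 'epoch'
       + cast(json_extract_path_text(score, 'timestamp') as bigint) / 1000
       * interval '1 second'
from schema.table
limit 10;
"""

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_text_to_timestamp(score_json: str) -> datetime:
    """Parse the JSON text and interpret its 'timestamp' field as epoch milliseconds."""
    raw = json.loads(score_json)["timestamp"]
    # Integer arithmetic via timedelta avoids float rounding of the milliseconds.
    return EPOCH + timedelta(milliseconds=int(raw))


score = '{"Choices":null, "timestamp":"1579650266955", "scaledScore":null}'
ts = epoch_ms_text_to_timestamp(score)
print(ts.isoformat())  # 2020-01-21T23:44:26.955000+00:00
```

Note that the integer division in the SQL sketch drops the millisecond part, while the Python version keeps it.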