AWS Glue – Querying Nested JSON with Relationalize Transform

AWS Glue provides a Relationalize transform that converts nested JSON into flat columns, which you can then write to S3 or import into relational databases. As an example, here is the initial schema:

>>> df.printSchema()
root
 |-- Id: string (nullable = true)
 |-- LastUpdated: long (nullable = true)
 |-- LastUpdatedBy: string (nullable = true)
 |-- Properties: struct (nullable …
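To make the idea of "nested JSON into columns" concrete, here is a minimal sketch in plain Python. It is not the Glue API and not the post's own code: it only imitates the flattening step by turning nested structs into dot-separated column names, which is my understanding of how Relationalize names flattened fields. The real transform also pivots arrays out into separate tables, which this sketch does not attempt. The function name `flatten_record` and every field under `Properties` are hypothetical, since the excerpt's schema is cut off at that point.

```python
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into a single level with dot-separated keys."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested structs, carrying the parent path along.
            flat.update(flatten_record(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical record: the top-level fields match the schema above, but the
# contents of Properties are invented for illustration only.
sample = {
    "Id": "abc-123",
    "LastUpdated": 1580000000,
    "LastUpdatedBy": "etl-user",
    "Properties": {"Color": "red", "Size": {"Width": 10, "Height": 20}},
}

flat = flatten_record(sample)
for column, value in flat.items():
    print(column, "=", value)
```

Each key of `flat` corresponds to one column of the flattened table, so a struct nested two levels deep ends up as a column such as `Properties.Size.Width`.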

Use Case with RDS Snapshot Export to S3

AWS recently announced the "Amazon RDS Snapshot Export to S3" feature, which lets you export Amazon Relational Database Service (Amazon RDS) or Amazon Aurora snapshots to Amazon S3 as Apache Parquet, an efficient open columnar storage format for analytics. I had a use case to refresh Athena tables daily with the full data set in Account B (us-east-1) …
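As a rough illustration of how a daily export could be started, the sketch below builds the arguments for boto3's `start_export_task` call on the RDS client. It is an assumption about one possible setup, not the approach described in the full post: the helper names `build_export_request` and `start_export`, the `daily-export-` identifier scheme, the `exports/` prefix layout, and all ARNs and bucket names are hypothetical placeholders.

```python
from datetime import date
from typing import Dict


def build_export_request(
    snapshot_arn: str,
    bucket: str,
    role_arn: str,
    kms_key_id: str,
    run_date: date,
) -> Dict[str, str]:
    """Build keyword arguments for the RDS start_export_task call."""
    return {
        # The identifier and prefix include the date so each daily run is distinct.
        "ExportTaskIdentifier": f"daily-export-{run_date:%Y-%m-%d}",
        "SourceArn": snapshot_arn,
        "S3BucketName": bucket,
        "S3Prefix": f"exports/{run_date:%Y/%m/%d}",
        "IamRoleArn": role_arn,
        "KmsKeyId": kms_key_id,
    }


def start_export(request: Dict[str, str], region: str) -> dict:
    """Submit the export task; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    rds = boto3.client("rds", region_name=region)
    return rds.start_export_task(**request)


# Placeholder values only; replace with real ARNs before calling start_export.
example_request = build_export_request(
    snapshot_arn="arn:aws:rds:REGION:ACCOUNT_ID:snapshot:SNAPSHOT_NAME",
    bucket="example-export-bucket",
    role_arn="arn:aws:iam::ACCOUNT_ID:role/EXAMPLE_EXPORT_ROLE",
    kms_key_id="EXAMPLE_KMS_KEY_ID",
    run_date=date(2020, 2, 1),
)
print(example_request["ExportTaskIdentifier"])
```

Keeping the request builder separate from the API call makes the naming scheme easy to check locally before anything is submitted to AWS.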