PostgreSQL – CPU Utilization and Index

One of our production Aurora PostgreSQL instances, running on db.r4.16xlarge (64 vCPUs, 488 GB RAM), was reporting high CPU utilization, spiking up to 100%.

[Screenshot: CPU utilization spiking to 100%]

With such issues, one of the first things to do is look for the SQLs with high shared buffer hits. I have built a small tool called pgsnap, something similar to the AWR repository in Oracle, which maintains SQL statistics history. So, with pg_stat_statements and hist_pg_stat_statements (that's what I call it), I was able to identify the SQL.
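A query along these lines against pg_stat_statements surfaces the top consumers (a minimal sketch using the stock columns; on PostgreSQL 13+ the column is total_exec_time rather than total_time):

-- top statements by shared buffer hits
select queryid, calls, total_time, shared_blks_hit
from pg_stat_statements
order by shared_blks_hit desc
limit 10;

The statement it pointed to: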

select col1, col2, col3, col4, col5 from cltn_errs redis0_ where redis0_.sbmn_id=123456;

Let's look at the execution plan:

Gather  (cost=1000.00..2070031.26 rows=2696 width=262) (actual time=17475.126..18771.216 rows=1 loops=1)                                                       
   Output: col1, col2, col3, col4, col5                                   
   Workers Planned: 7                                                                                                                                           
   Workers Launched: 0                                                                                                                                          
   Buffers: shared hit=3945515                                                                                                                            
   ->  Parallel Seq Scan on demo.cltn_errs redis0_  (cost=0.00..2068761.66 rows=385 width=262) (actual time=17474.807..18770.895 rows=1 loops=1) 
         Output: col1, col2, col3, col4, col5                              
         Filter: (redis0_.sbmn_id = '123456'::numeric)                                                                                        
         Rows Removed by Filter: 52390761                                                                                                                       
         Buffers: shared hit=3945515                                                                                                                            
 Planning time: 0.652 ms  
 Execution time: 18771.384 ms      

The problem was obvious!! A full table scan of the table cltn_errs (a ~15 GB table). So, I restored the latest snapshot and created an index on the “sbmn_id” column.
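A minimal sketch of the index build (the name matches the plan below; CONCURRENTLY avoids blocking writes if run against a live table):

create index concurrently idx_sbmn_id on demo.cltn_errs (sbmn_id);

With the index in place, the execution plan changed to: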

Index Scan using idx_sbmn_id on demo.cltn_errs redis0_  (cost=0.57..182.80 rows=3086 width=272) (actual time=0.031..0.032 rows=1 loops=1) 
   Output: col1, col2, col3, col4, col5                                           
   Index Cond: (redis0_.sbmn_id = '123456'::numeric)                                                                                                 
   Buffers: shared hit=5                                                                                                                                 
 Planning time: 0.573 ms                                                                                                                                               
 Execution time: 0.085 ms        

Wow!! After the index, the shared buffer hits and the total execution time improved magnificently, by more than 100x. So, with this testing done, I created the index on Prod, and after this little change the CPU utilization graph said it all.

[Screenshot: CPU utilization dropping back down after the index was created]


Python – Flatten List of Lists

itertools is one of the most powerful modules in Python. Today I had a requirement to flatten a list of lists, and itertools made it easy.

My list —

>>> val = [['a','b'],'c',['d','e','f']]

Required Result

['a', 'b', 'c', 'd', 'e', 'f']

How do you do it? itertools to the rescue —

>>> from itertools import chain
>>> list(chain.from_iterable(val))
['a', 'b', 'c', 'd', 'e', 'f']
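
One caveat: chain.from_iterable iterates every element, so the bare string 'c' above survives only because it is a single character. A multi-character string would be split into its characters:

>>> list(chain.from_iterable([['a','b'],'cd']))
['a', 'b', 'c', 'd']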

So simple!!

Python – Enumerate

Suppose you have a dataset (data) and want to find every 5th item. How would you do it?

data = [ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

The first thing that might come to mind is using a slice, but a plain step slice won't work, since slicing is index-based and indexing starts from 0.

>>> data[::5]
[1, 11]

The expected answer is [9, 19].

This is where enumerate comes into play. It allows us to loop over something and keep an automatic counter, starting from any value we choose.

>>> def get_nth_item(n=5):
...     new_data = []
...     for i, d in enumerate(data, 1):
...         if i % n == 0:
...             new_data.append(d)
...     print(new_data)
...
>>>
>>> get_nth_item(n=5)
[9, 19]
>>> get_nth_item(n=4)
[7, 15]
>>>
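
For lists specifically, an offset slice gives the same result; enumerate simply generalizes to any iterable and keeps the counter explicit:

>>> data[4::5]
[9, 19]
>>> data[3::4]
[7, 15]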


AWS DMS – Target TableName Differs

AWS DMS is a tool that supports both homogeneous and heterogeneous migrations, helping you move databases to the AWS cloud.

In most migrations, the source and target table names remain the same, in which case the Mappings.json file is pretty simple. As an example (Oracle to PostgreSQL):

 {
    "rules":
    [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "1",
            "object-locator":
            {
                "schema-name": "DEVO",
                "table-name": "TEST_DEMO"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "transformation",
            "rule-id": "2",
            "rule-name": "2",
            "rule-action": "convert-lowercase",
            "rule-target": "schema",
            "object-locator":
            {
                "schema-name": "%"
            }
        },
        {
            "rule-type": "transformation",
            "rule-id": "3",
            "rule-name": "3",
            "rule-action": "convert-lowercase",
            "rule-target": "table",
            "object-locator":
            {
                "schema-name": "%",
                "table-name": "%"
            }
        },
        {
            "rule-type": "transformation",
            "rule-id": "4",
            "rule-name": "4",
            "rule-action": "convert-lowercase",
            "rule-target": "column",
            "object-locator":
            {
                "schema-name": "%",
                "table-name": "%",
                "column-name": "%"
            }
        }
    ]
}

The above Mappings.json includes the table DEVO.TEST_DEMO to be migrated from Oracle to PostgreSQL, with convert-lowercase transformations applied to the schema, table, and column names.

But what if the target table name is different, say migrating DEVO.TEST_DEMO to devo.test_demo_new? In that scenario, the Mappings.json below can be used —

    {
      "rules": [
        {
          "rule-type": "selection",
          "rule-id": "1",
          "rule-name": "1",
          "object-locator": {
            "schema-name": "DEVO",
            "table-name": "TEST_DEMO"
          },
          "rule-action": "include"
        },
        {
          "rule-type": "transformation",
          "rule-id": "2",
          "rule-name": "2",
          "rule-action": "convert-lowercase",
          "rule-target": "schema",
          "object-locator": {
            "schema-name": "%"
          }
        },
        {
          "rule-type": "transformation",
          "rule-id": "3",
          "rule-name": "3",
          "rule-action": "rename",
          "rule-target": "table",
          "object-locator": {
            "schema-name": "devo",
            "table-name": "TEST_DEMO"
          },
          "value": "test_demo_new"
        },
        {
          "rule-type": "transformation",
          "rule-id": "4",
          "rule-name": "4",
          "rule-action": "convert-lowercase",
          "rule-target": "table",
          "object-locator": {
            "schema-name": "%",
            "table-name": "%"
          }
        },
        {
          "rule-type": "transformation",
          "rule-id": "5",
          "rule-name": "5",
          "rule-action": "convert-lowercase",
          "rule-target": "column",
          "object-locator": {
            "schema-name": "devo",
            "table-name": "test_demo",
            "column-name": "%"
          }
        }
      ]
    }
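
For reference, the mapping file is passed in when creating the replication task. A sketch with placeholder values (the task identifier and ARNs below are hypothetical):

aws dms create-replication-task \
    --replication-task-identifier oracle-to-pg-demo \
    --source-endpoint-arn <source-endpoint-arn> \
    --target-endpoint-arn <target-endpoint-arn> \
    --replication-instance-arn <replication-instance-arn> \
    --migration-type full-load \
    --table-mappings file://Mappings.json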

PostgreSQL – Unique constraint and null value

An important behavior to know about in PostgreSQL is that duplicate null values do not violate unique constraints.

Oracle

SQL> create table test (id number (2,0), 
                        country varchar(20) not null, 
                        state varchar(20)
                       );

Table created.

SQL> alter table test add constraint pk_test_id primary key (id);

Table altered.

SQL> alter table test add constraint uniq_test_cs unique (country, state);

Table altered.

PostgreSQL

admin@test # create table test (id numeric(2,0), 
                                country character varying(20) not null, 
                                state character varying(20)
                                );
CREATE TABLE
Time: 78.866 ms

admin@test # alter table test add constraint pk_test_id primary key (id);
ALTER TABLE
Time: 81.062 ms

admin@test # alter table test add constraint uniq_test_cs unique (country, state);
ALTER TABLE
Time: 82.800 ms

Let's look at how null is handled with the unique constraint in place.

Behaviour in Oracle

SQL> insert into test values (1, 'USA','SEATTLE');

1 row created.

SQL>  insert into test values (2, 'USA', 'OREGON');

1 row created.

SQL> insert into test (id, country) values (3, 'USA');

1 row created.

SQL> select * from test;

        ID COUNTRY              STATE
---------- -------------------- --------------------
         1 USA                  SEATTLE
         2 USA                  OREGON
         3 USA

SQL> insert into test (id, country) values (4, 'USA');
insert into test (id, country) values (4, 'USA')
*
ERROR at line 1:
ORA-00001: unique constraint (DEV.UNIQ_TEST_CS) violated


Behaviour in PostgreSQL

admin@test # insert into test values (1, 'USA','SEATTLE');
INSERT 0 1
Time: 79.928 ms

admin@test # insert into test values (2, 'USA', 'OREGON');
INSERT 0 1
Time: 72.490 ms

admin@test # insert into test (id, country) values (3, 'USA');
INSERT 0 1
Time: 75.906 ms
admin@test # select * from test;
+----+---------+---------+
| id | country |  state  |
+----+---------+---------+
|  1 | USA     | SEATTLE |
|  2 | USA     | OREGON  |
|  3 | USA     | NULL    |
+----+---------+---------+
(3 rows)

Time: 74.928 ms
admin@test # insert into test (id, country) values (4, 'USA');
INSERT 0 1
Time: 76.040 ms
admin@test # select * from test;
+----+---------+---------+
| id | country |  state  |
+----+---------+---------+
|  1 | USA     | SEATTLE |
|  2 | USA     | OREGON  |
|  3 | USA     | NULL    |
|  4 | USA     | NULL    |
+----+---------+---------+
(4 rows)

Time: 83.693 ms
admin@test #

Two ways to handle this situation —

1. Set a default value for the column, which will then cause a “duplicate key value violates” error on repeats (see the sketch below).

2. Create a partial index.
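
A minimal sketch of option 1 (the default value 'UNKNOWN' is just an illustration):

admin@test # alter table test alter column state set default 'UNKNOWN';
ALTER TABLE

With the default in place, every row inserted without a state carries 'UNKNOWN', so a second such row for the same country violates uniq_test_cs.

Option 2, the partial index, in action: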

admin@test # create unique index uniq_test_c on test(country) where state is null;
ERROR:  could not create unique index "uniq_test_c"
DETAIL:  Key (country)=(USA) is duplicated.
Time: 254.043 ms

admin@test # delete from test where id in (3,4);
DELETE 2
Time: 77.395 ms
admin@test # select * from test;
+----+---------+---------+
| id | country |  state  |
+----+---------+---------+
|  1 | USA     | SEATTLE |
|  2 | USA     | OREGON  |
+----+---------+---------+
(2 rows)

Time: 77.451 ms

admin@test # create unique index uniq_test_c on test(country) where state is null;
CREATE INDEX
Time: 78.431 ms

admin@test # insert into test (id, country) values (3, 'USA');
INSERT 0 1
Time: 75.413 ms

admin@test # insert into test (id, country) values (4, 'USA');
ERROR:  duplicate key value violates unique constraint "uniq_test_c"
DETAIL:  Key (country)=(USA) already exists.
Time: 95.083 ms
admin@test #

As per the PostgreSQL documentation:

In general, a unique constraint is violated if there is more than one row in the table where the values of all of the columns included in the constraint are equal. However, two null values are never considered equal in this comparison. That means even in the presence of a unique constraint it is possible to store duplicate rows that contain a null value in at least one of the constrained columns. This behavior conforms to the SQL standard, but we have heard that other SQL databases might not follow this rule. So be careful when developing applications that are intended to be portable.

FillFactor for UPDATE

What is FillFactor in PostgreSQL?

As per the official doc —

fillfactor (integer)

The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default. When a smaller fillfactor is specified, INSERT operations pack table pages only to the indicated percentage; the remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page as the original, which is more efficient than placing it on a different page. For a table whose entries are never updated, complete packing is the best choice, but in heavily updated tables smaller fillfactors are appropriate. This parameter cannot be set for TOAST tables.

Does this ring any bell related to Oracle?

Yes, it's PCTFREE.

The PCTFREE parameter sets the minimum percentage of a data block to be reserved as free space for possible updates to rows that already exist in that block.

In the Postgres world, during an UPDATE the old tuple is marked as deleted by the current transaction and a new tuple is inserted. Depending on space availability within the same page, the new tuple is placed either in that page or in another page, and the index entry is updated to point to the new tuple (a HOT update, where the new version fits in the same page and no indexed column changes, avoids the index update altogether).
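
The page-level inspection below relies on the pageinspect extension, which provides get_raw_page and heap_page_items; it needs to be installed once per database:

anand@postgres # create extension if not exists pageinspect;
CREATE EXTENSION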

anand@postgres # create table test (data varchar(2));
CREATE TABLE
Time: 7.049 ms

anand@postgres # insert into test values ('a');
INSERT 0 1
Time: 1.408 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |   8160 |        1 |     26 |   1012 |      0 |        0 | (0,1)  |           1 |       2050 |     24 | NULL   |  NULL | \x0561 |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(1 row)

Time: 0.296 ms

anand@postgres # SELECT txid_current_snapshot();
+-----------------------+
| txid_current_snapshot |
+-----------------------+
| 1013:1013:            |
+-----------------------+
(1 row)

Time: 0.213 ms
anand@postgres # insert into test values ('b');
INSERT 0 1
Time: 1.237 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |   8160 |        1 |     26 |   1012 |      0 |        0 | (0,1)  |           1 |       2050 |     24 | NULL   |  NULL | \x0561 |
|  2 |   8128 |        1 |     26 |   1013 |      0 |        0 | (0,2)  |           1 |       2050 |     24 | NULL   |  NULL | \x0562 |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(2 rows)

Time: 0.448 ms

anand@postgres # update test set data='c' where data='a';
UPDATE 1
Time: 1.806 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |   8160 |        1 |     26 |   1012 |   1014 |        0 | (0,3)  |       16385 |        258 |     24 | NULL   |  NULL | \x0561 |
|  2 |   8128 |        1 |     26 |   1013 |      0 |        0 | (0,2)  |           1 |       2306 |     24 | NULL   |  NULL | \x0562 |
|  3 |   8096 |        1 |     26 |   1014 |      0 |        0 | (0,3)  |       32769 |      10242 |     24 | NULL   |  NULL | \x0563 |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(3 rows)

Time: 0.352 ms

anand@postgres # delete from test where data='c';
DELETE 1
Time: 1.542 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |   8160 |        1 |     26 |   1012 |   1014 |        0 | (0,3)  |       16385 |       1282 |     24 | NULL   |  NULL | \x0561 |
|  2 |   8128 |        1 |     26 |   1013 |      0 |        0 | (0,2)  |           1 |       2306 |     24 | NULL   |  NULL | \x0562 |
|  3 |   8096 |        1 |     26 |   1014 |   1015 |        0 | (0,3)  |       40961 |       8450 |     24 | NULL   |  NULL | \x0563 |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(3 rows)

Time: 0.342 ms
anand@postgres # vacuum test;
VACUUM
Time: 12.944 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |      0 |        0 |      0 |   NULL |   NULL |     NULL | NULL   |        NULL |       NULL |   NULL | NULL   |  NULL | NULL   |
|  2 |   8160 |        1 |     26 |   1013 |      0 |        0 | (0,2)  |           1 |       2306 |     24 | NULL   |  NULL | \x0562 |
|  3 |      0 |        0 |      0 |   NULL |   NULL |     NULL | NULL   |        NULL |       NULL |   NULL | NULL   |  NULL | NULL   |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(3 rows)

Time: 0.536 ms
anand@postgres # vacuum full test;
VACUUM
Time: 15.169 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |   8160 |        1 |     26 |   1013 |      0 |        0 | (0,1)  |           1 |       2818 |     24 | NULL   |  NULL | \x0562 |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(1 row)

Time: 0.561 ms
anand@postgres #

Recently in a prod db, I noticed a table with a primary key index receiving ~3500 updates/hr, each update modifying 1 row with a mean_time of ~2-3 seconds. The updates were eligible for HOT. (In the snapshot below, mean_time is in milliseconds.)

+---------+------------------------+-----------+-------+------------------+
| snap_id |      sample_time       |  queryid  | calls |    mean_time     |
+---------+------------------------+-----------+-------+------------------+
|      99 | 2018-04-03 14:00:01+00 | 983545275 |  3551 | 8624.70651816391 |
|     100 | 2018-04-03 15:00:00+00 | 983545275 |  3831 | 485.123798225008 |
|     103 | 2018-04-03 18:00:01+00 | 983545275 |  3567 | 8752.98942528735 |
|     104 | 2018-04-03 19:00:01+00 | 983545275 |  3445 |  1173.0251509434 |
|     105 | 2018-04-03 20:00:00+00 | 983545275 |  3897 | 906.002405953299 |
+---------+------------------------+-----------+-------+------------------+


Modifying the fillfactor for the table to 50 (keeping 50% of each page free for updates) improved the performance by ~1000x.

anand@postgres # alter table user.cache set (fillfactor = 50);
ALTER TABLE
Time: 126.374 ms
anand@postgres # vacuum full user.cache;
VACUUM
Time: 21284.498 ms (00:21.284)
anand@postgres #

Reducing the fillfactor increases the table size, since on insert some space in each page is left unused for future updates.
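
The applied setting can be verified from pg_class (a quick sketch; 'cache' is the table from this example):

select relname, reloptions
from pg_class
where relname = 'cache';
-- reloptions should show {fillfactor=50}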

Post-change SQL performance —

+---------+------------------------+-----------+-------+-----------+-----------------+
| snap_id |      sample_time       |  queryid  | calls | mean_time | shared_blks_hit |
+---------+------------------------+-----------+-------+-----------+-----------------+
|     529 | 2018-04-22 13:00:01+00 | 983545275 |  3788 |      0.89 |         3219353 |
|     530 | 2018-04-22 14:00:00+00 | 983545275 |  4009 |      0.88 |         3413391 |
|     531 | 2018-04-22 15:00:00+00 | 983545275 |  4021 |      0.91 |         3564540 |
|     532 | 2018-04-22 16:00:01+00 | 983545275 |  4005 |      1.12 |         3721869 |
|     533 | 2018-04-22 17:00:01+00 | 983545275 |  4093 |      1.07 |         3901729 |
|     534 | 2018-04-22 18:00:01+00 | 983545275 |  4033 |      0.98 |         4071757 |
|     535 | 2018-04-22 19:00:01+00 | 983545275 |  3773 |      1.38 |         4232251 |
|     536 | 2018-04-22 20:00:01+00 | 983545275 |  3856 |      1.23 |         4372074 |
|     537 | 2018-04-22 21:00:00+00 | 983545275 |  3978 |      1.07 |         4535671 |
+---------+------------------------+-----------+-------+-----------+-----------------+

+---------+---------+------------+----------+----------+-----------+-----------+-----------+---------+----------+------------+
| relname | relsize | tblsp      | seq_scan | idx_scan | n_tup_ins | n_tup_upd | n_tup_del | total   | hot_rate | fillfactor |
+---------+---------+------------+----------+----------+-----------+-----------+-----------+---------+----------+------------+
| cache   | 101 MB  | pg_default |       12 |  5926041 |     54404 |   1403579 |     59859 | 1923481 |    71.10 | 50         |
+---------+---------+------------+----------+----------+-----------+-----------+-----------+---------+----------+------------+
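
The hot_rate column can be derived from pg_stat_user_tables, which tracks HOT updates separately (a sketch):

select relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100 * n_tup_hot_upd::numeric / nullif(n_tup_upd, 0), 2) as hot_rate
from pg_stat_user_tables
where relname = 'cache';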

S3 Bucket – IllegalLocationConstraintException

A quick one!!

Creating a bucket failed with “IllegalLocationConstraintException” —

[ec2-user@ip-172-31-33-128 ~]$ aws s3api create-bucket --bucket labtestrand --region us-west-2

An error occurred (IllegalLocationConstraintException) when calling the CreateBucket operation: The unspecified location constraint is incompatible for the region specific endpoint this request was sent to.

Solution —

[ec2-user@ip-172-31-33-128 ~]$
[ec2-user@ip-172-31-33-128 ~]$ aws s3api create-bucket --bucket labtestrand --region us-west-2 --create-bucket-configuration LocationConstraint=us-west-2
{
"Location": "http://labtestrand.s3.amazonaws.com/"
}
[ec2-user@ip-172-31-33-128 ~]$
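
The cause: without --create-bucket-configuration, the request carries no LocationConstraint, which S3 interprets as us-east-1; sending that to the us-west-2 regional endpoint produces the mismatch above. In us-east-1 itself no constraint is needed (the bucket name below is hypothetical, since names are global):

[ec2-user@ip-172-31-33-128 ~]$ aws s3api create-bucket --bucket labtestrand-east --region us-east-1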