PostgreSQL – Unique constraint and null value

An important behavior in PostgreSQL to know about is the duplicate null values do not violate unique constraints.

Oracle

SQL> create table test (id number (2,0), 
                        country varchar(20) not null, 
                        state varchar(20)
                       );

Table created.

SQL> alter table test add constraint pk_test_id primary key (id);

Table altered.

SQL> alter table test add constraint uniq_test_cs unique (country, state);

Table altered.

PostgreSQL

admin@test # create table test (id numeric(2,0), 
                                country character varying(20) not null, 
                                state character varying(20)
                                );
CREATE TABLE
Time: 78.866 ms

admin@test # alter table test add constraint pk_test_id primary key (id);
ALTER TABLE
Time: 81.062 ms

admin@test # alter table test add constraint uniq_test_cs unique (country, state);
ALTER TABLE
Time: 82.800 ms

Let look at the way how null is handle with unique constraint in place.

Behaviour in Oracle

SQL> insert into test values (1, 'USA','SEATTLE');

1 row created.

SQL>  insert into test values (2, 'USA', 'OREGON');

1 row created.

SQL> insert into test (id, country) values (3, 'USA');

1 row created.

SQL> select * from test;

        ID COUNTRY              STATE
---------- -------------------- --------------------
         1 USA                  SEATTLE
         2 USA                  OREGON
         3 USA

SQL> insert into test (id, country) values (4, 'USA');
insert into test (id, country) values (4, 'USA')
*
ERROR at line 1:
ORA-00001: unique constraint (DEV.UNIQ_TEST_CS) violated


Behaviour in PostgreSQL

admin@test # insert into test values (1, 'USA','SEATTLE');
INSERT 0 1
Time: 79.928 ms

admin@test # insert into test values (2, 'USA', 'OREGON');
INSERT 0 1
Time: 72.490 ms

admin@test # insert into test (id, country) values (3, 'USA');
INSERT 0 1
Time: 75.906 ms
admin@test # select * from test;
+----+---------+---------+
| id | country |  state  |
+----+---------+---------+
|  1 | USA     | SEATTLE |
|  2 | USA     | OREGON  |
|  3 | USA     | NULL    |
+----+---------+---------+
(3 rows)

Time: 74.928 ms
admin@test # insert into test (id, country) values (4, 'USA');
INSERT 0 1
Time: 76.040 ms
admin@test # select * from test;
+----+---------+---------+
| id | country |  state  |
+----+---------+---------+
|  1 | USA     | SEATTLE |
|  2 | USA     | OREGON  |
|  3 | USA     | NULL    |
|  4 | USA     | NULL    |
+----+---------+---------+
(4 rows)

Time: 83.693 ms
admin@test #

2 ways to handle such situation —

1. Set default value for the column, which will case “duplicate key value violates” error.

2. Create Partial index.

admin@test # create unique index uniq_test_c on test(country) where state is null;
ERROR:  could not create unique index "uniq_test_c"
DETAIL:  Key (country)=(USA) is duplicated.
Time: 254.043 ms

admin@test # delete from test where id in (3,4);
DELETE 2
Time: 77.395 ms
admin@test # select * from test;
+----+---------+---------+
| id | country |  state  |
+----+---------+---------+
|  1 | USA     | SEATTLE |
|  2 | USA     | OREGON  |
+----+---------+---------+
(2 rows)

Time: 77.451 ms

admin@test # create unique index uniq_test_c on test(country) where state is null;
CREATE INDEX
Time: 78.431 ms

admin@test # insert into test (id, country) values (3, 'USA');
INSERT 0 1
Time: 75.413 ms

admin@test # insert into test (id, country) values (4, 'USA');
ERROR:  duplicate key value violates unique constraint "uniq_test_c"
DETAIL:  Key (country)=(USA) already exists.
Time: 95.083 ms
admin@test #

As per PostgreSQL documentation

In general, a unique constraint is violated if there is more than one row in the table where the values of all of the columns included in the constraint are equal. However, two null values are never considered equal in this comparison. That means even in the presence of a unique constraint it is possible to store duplicate rows that contain a null value in at least one of the constrained columns. This behavior conforms to the SQL standard, but we have heard that other SQL databases might not follow this rule. So be careful when developing applications that are intended to be portable

Advertisements

FillFactor for UPDATE

What is FillFactor in PostgreSQL?

As per offical Doc —

fillfactor (integer)

The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default. When a smaller fillfactor is specified, INSERT operations pack table pages only to the indicated percentage; the remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page as the original, which is more efficient than placing it on a different page. For a table whose entries are never updated, complete packing is the best choice, but in heavily updated tables smaller fillfactors are appropriate. This parameter cannot be set for TOAST tables.

Does this ring any bell related to Oracle?

Yes, its PCTFREE.

The PCTFREE parameter sets the minimum percentage of a data block to be reserved as free space for possible updates to rows that already exist in that block.

In Postgres world, during an UPDATE the old tuple is marked as deleted by the current transaction and new tuple is inserted. Now depending on space availability within the same page, the new tuple can be inserted into it or in another page. Along with this, index pointer modification to the new tuple will be done.

anand@postgres # create table test (data varchar(2));
CREATE TABLE
Time: 7.049 ms

anand@postgres # insert into test values ('a');
INSERT 0 1
Time: 1.408 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |   8160 |        1 |     26 |   1012 |      0 |        0 | (0,1)  |           1 |       2050 |     24 | NULL   |  NULL | \x0561 |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(1 row)

Time: 0.296 ms

anand@postgres # SELECT txid_current_snapshot();
+-----------------------+
| txid_current_snapshot |
+-----------------------+
| 1013:1013:            |
+-----------------------+
(1 row)

Time: 0.213 ms
anand@postgres # insert into test values ('b');
INSERT 0 1
Time: 1.237 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |   8160 |        1 |     26 |   1012 |      0 |        0 | (0,1)  |           1 |       2050 |     24 | NULL   |  NULL | \x0561 |
|  2 |   8128 |        1 |     26 |   1013 |      0 |        0 | (0,2)  |           1 |       2050 |     24 | NULL   |  NULL | \x0562 |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(2 rows)

Time: 0.448 ms

anand@postgres # update test set data='c' where data='a';
UPDATE 1
Time: 1.806 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |   8160 |        1 |     26 |   1012 |   1014 |        0 | (0,3)  |       16385 |        258 |     24 | NULL   |  NULL | \x0561 |
|  2 |   8128 |        1 |     26 |   1013 |      0 |        0 | (0,2)  |           1 |       2306 |     24 | NULL   |  NULL | \x0562 |
|  3 |   8096 |        1 |     26 |   1014 |      0 |        0 | (0,3)  |       32769 |      10242 |     24 | NULL   |  NULL | \x0563 |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(3 rows)

Time: 0.352 ms

anand@postgres # delete from test where data='c';
DELETE 1
Time: 1.542 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |   8160 |        1 |     26 |   1012 |   1014 |        0 | (0,3)  |       16385 |       1282 |     24 | NULL   |  NULL | \x0561 |
|  2 |   8128 |        1 |     26 |   1013 |      0 |        0 | (0,2)  |           1 |       2306 |     24 | NULL   |  NULL | \x0562 |
|  3 |   8096 |        1 |     26 |   1014 |   1015 |        0 | (0,3)  |       40961 |       8450 |     24 | NULL   |  NULL | \x0563 |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(3 rows)

Time: 0.342 ms
anand@postgres # vacuum test;
VACUUM
Time: 12.944 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |      0 |        0 |      0 |   NULL |   NULL |     NULL | NULL   |        NULL |       NULL |   NULL | NULL   |  NULL | NULL   |
|  2 |   8160 |        1 |     26 |   1013 |      0 |        0 | (0,2)  |           1 |       2306 |     24 | NULL   |  NULL | \x0562 |
|  3 |      0 |        0 |      0 |   NULL |   NULL |     NULL | NULL   |        NULL |       NULL |   NULL | NULL   |  NULL | NULL   |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(3 rows)

Time: 0.536 ms
anand@postgres # vacuum full test;
VACUUM
Time: 15.169 ms
anand@postgres # select * from heap_page_items(get_raw_page('test',0));
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
| lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
|  1 |   8160 |        1 |     26 |   1013 |      0 |        0 | (0,1)  |           1 |       2818 |     24 | NULL   |  NULL | \x0562 |
+----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+--------+
(1 row)

Time: 0.561 ms
anand@postgres #

Recently in Prod db, I noticed a table with primary key index having ~3500 updates/hr and each update, updated 1 row with mean_time of ~2-3seconds. The updates were eligibile for HOT.

+---------+------------------------+-----------+-------+------------------+
| snap_id |      sample_time       |  queryid  | calls |    mean_time     |
+---------+------------------------+-----------+-------+------------------+
|      99 | 2018-04-03 14:00:01+00 | 983545275 |  3551 | 8624.70651816391 |
|     100 | 2018-04-03 15:00:00+00 | 983545275 |  3831 | 485.123798225008 |
|     103 | 2018-04-03 18:00:01+00 | 983545275 |  3567 | 8752.98942528735 |
|     104 | 2018-04-03 19:00:01+00 | 983545275 |  3445 |  1173.0251509434 |
|     105 | 2018-04-03 20:00:00+00 | 983545275 |  3897 | 906.002405953299 |


Modifying the fill-factor for the table to 50 (keeping 50% space free for updates),  improved the performance by 1000x.

anand@postgres # alter table user.cache set (fillfactor = 50);
ALTER TABLE
Time: 126.374 ms
anand@postgres # vacuum full user.cahce;
VACUUM
Time: 21284.498 ms (00:21.284)
anand@postgres #

Reducing the fill factor increases the table size as during insert some space in the page will be left unused to future updates.

Post Change SQL Peformance —

+---------+------------------------+-----------+-------+-----------+-----------------+
| snap_id |      sample_time       |  queryid  | calls | mean_time | shared_blks_hit |
+---------+------------------------+-----------+-------+-----------+-----------------+
|     529 | 2018-04-22 13:00:01+00 | 983545275 |  3788 |      0.89 |         3219353 |
|     530 | 2018-04-22 14:00:00+00 | 983545275 |  4009 |      0.88 |         3413391 |
|     531 | 2018-04-22 15:00:00+00 | 983545275 |  4021 |      0.91 |         3564540 |
|     532 | 2018-04-22 16:00:01+00 | 983545275 |  4005 |      1.12 |         3721869 |
|     533 | 2018-04-22 17:00:01+00 | 983545275 |  4093 |      1.07 |         3901729 |
|     534 | 2018-04-22 18:00:01+00 | 983545275 |  4033 |      0.98 |         4071757 |
|     535 | 2018-04-22 19:00:01+00 | 983545275 |  3773 |      1.38 |         4232251 |
|     536 | 2018-04-22 20:00:01+00 | 983545275 |  3856 |      1.23 |         4372074 |
|     537 | 2018-04-22 21:00:00+00 | 983545275 |  3978 |      1.07 |         4535671 |
+---------+------------------------+-----------+-------+-----------+-----------------+

+------------------+---------+------------+----------+-------------+-----------+-----------+-----------+-----------+----------+------------+
|       relname    | relsize |   tblsp    | seq_scan |  idx_scan   | n_tup_ins | n_tup_upd | n_tup_del |   total   | hot_rate | fillfactor |
+------------------+---------+------------+----------+-------------+-----------+-----------+-----------+-----------+----------+------------+
|       cache      | 101 MB  | pg_default |       12 |     5926041 |     54404 |   1403579 |     59859 |   1923481 |    71.10 | 50         |

Oracle to Postgres — Index Skip Scan

I have been working on Oracle to Postgres migration and it has been a fun ride.

One of the recent encounters during migration was performance issue in Postgres for a query, actually quite a simple query.

SELECT count(*) AS num_incidents FROM audit_cs acs INNER JOIN audit_changes ac ON acs.audit_change_id = ac.audit_change_set WHERE acs.object_id=$1 AND ac.path = 'Flag' AND ac.new_value = 'sign' 

Let’s look at the execution plan in Postgres

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                      QUERY PLAN                                                                                                       |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=416855.54..416855.55 rows=1 width=8) (actual time=1885.330..1885.331 rows=1 loops=1)                                                                                                                 |
|   Output: count(*)                                                                                                                                                                                                    |
|   Buffers: shared hit=192660 read=3                                                                                                                                                                                   |
|   ->  Nested Loop  (cost=1.13..416855.54 rows=1 width=0) (actual time=1885.324..1885.324 rows=0 loops=1)                                                                                                              |
|         Buffers: shared hit=192660 read=3                                                                                                                                                                             |
|         ->  Index Scan using i_ot_oi_cacs_comp on user.audit_cs acs  (cost=0.56..416382.89 rows=38 width=8) (actual time=1870.631..1882.638 rows=3 loops=1)                                                           |
|               Output: acs.audit_change_id, acs.date_record_added, acs.object_type, acs.object_id, acs.note, acs.created_by_user_id                                                                                    |
|               Index Cond: (acs.object_id = '20896385'::numeric)                                                                                                                                                       |
|               Buffers: shared hit=192647 read=1                                                                                                                                                                       |
|         ->  Index Scan using i_acs_audit_changes on user.audit_changes ac  (cost=0.56..12.42 rows=2 width=8) (actual time=0.890..0.890 rows=0 loops=3)                                                                |
|               Output: ac.audit_change_id, ac.path, ac.old_value, ac.new_value, ac.audit_change_set, ac.created_by                                                                                                     |
|               Index Cond: (ac.audit_change_set = acs.audit_change_id)                                                                                                                                                 |
|               Filter: (((ac.path)::text = 'Flag'::text) AND ((ac.new_value)::text = 'sign'::text))                                                                                                                    |
|               Rows Removed by Filter: 1                                                                                                                                                                               |
|               Buffers: shared hit=13 read=2                                                                                                                                                                           |
| Planning time: 0.932 ms                                                                                                                                                                                               |
| Execution time: 1885.401 ms                                                                                                                                                                                           |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
(17 rows)

On a quick glance of the execution plan, it looks good as its reading the data using index, but to notice is the execution time of ~1.9secs and Buffers: shared hit of ~192k for index i_ot_oi_cacs_comp.

For the same sql, in Oracle, the optimizer uses Index Skip Scan on i_ot_oi_cacs_comp and completes within 2ms.

Does this give some hint 😉

Can the performance in Postgres be improved?

Firstly, lets review the definition of composite Index

"i_ot_oi_cacs_comp" btree (object_type, object_id)

Secondly, pg_stats gives a good understanding

select tablename, attname, inherited, null_frac, avg_width, n_distinct from pg_stats where tablename='audit_cs'

+------------------------+---------------------+-----------+-----------+-----------+------------+
|       tablename        |       attname       | inherited | null_frac | avg_width | n_distinct |
+------------------------+---------------------+-----------+-----------+-----------+------------+

|       audit_cs         | object_type         | f         |         0 |        14 |          1 |
|       audit_cs         | object_id           | f         |         0 |         6 |     750343 |

In the composte index, column object_type comes first and then object_id and Oracle handled it well with Index Skip Scan.

But do we really need object_type to be first column. The column object_id has much more distinct values when compared to object_type column. What happens if we change the column position for the index to

"i_oiot_cacs_comp" btree (object_id, object_type)

Lets the execution plan now

+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                                      QUERY PLAN                                                                                                       |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Aggregate  (cost=506.39..506.40 rows=1 width=8) (actual time=0.046..0.046 rows=1 loops=1)                                                                                                                             |
|   Output: count(*)                                                                                                                                                                                                    |
|   Buffers: shared hit=21                                                                                                                                                                                              |
|   ->  Nested Loop  (cost=1.13..506.39 rows=1 width=0) (actual time=0.043..0.043 rows=0 loops=1)                                                                                                                       |
|         Buffers: shared hit=21                                                                                                                                                                                        |
|         ->  Index Scan using i_oiot_cacs_comp on user.audit_cs acs  (cost=0.56..24.18 rows=39 width=8) (actual time=0.014..0.017 rows=3 loops=1)                                                                      |
|               Output: acs.audit_change_id, acs.date_record_added, acs.object_type, acs.object_id, acs.note, acs.created_by_user_id                                                                                    |
|               Index Cond: (acs.object_id = '20896385'::numeric)                                                                                                                                                       |
|               Buffers: shared hit=6                                                                                                                                                                                   |
|         ->  Index Scan using i_acs_audit_changes on user.audit_changes ac  (cost=0.56..12.34 rows=2 width=8) (actual time=0.007..0.007 rows=0 loops=3)                                                                |
|               Output: ac.audit_change_id, ac.path, ac.old_value, ac.new_value, ac.audit_change_set, ac.created_by                                                                                                     |
|               Index Cond: (ac.audit_change_set = acs.audit_change_id)                                                                                                                                                 |
|               Filter: (((ac.path)::text = 'Flag'::text) AND ((ac.new_value)::text = 'sign'::text))                                                                                                              |
|               Rows Removed by Filter: 1                                                                                                                                                                               |
|               Buffers: shared hit=15                                                                                                                                                                                  |
| Planning time: 0.441 ms                                                                                                                                                                                               |
| Execution time: 0.092 ms                                                                                                                                                                                              |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
(17 rows)

Just with the simple change in column position, has reduced Buffer: Shared hits from 192K to 21 and execution time from ~2secs to less than 1ms.

+---------+------------------------+------------+------------------+---------------------------------------+-------------+-------+---------+----------------+
| snap_id |      sample_time       |  queryid   |    get_sqlid     |              short_query              | total_time  | calls |  mean   | percentage_cpu |
+---------+------------------------+------------+------------------+---------------------------------------+-------------+-------+---------+----------------+
|      41 | 2018-03-30 09:17:00+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM | 39237541.19 | 16794 | 2336.40 |           7.59 |
|      42 | 2018-03-30 10:17:00+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM | 35176645.84 | 16448 | 2138.66 |           6.80 |
|      43 | 2018-03-30 11:17:01+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM | 29582400.11 | 15951 | 1854.58 |           5.72 |
|      44 | 2018-03-30 12:17:01+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM | 34728899.68 | 16244 | 2137.95 |           6.72 |
|      46 | 2018-03-30 14:17:01+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM | 37036501.98 | 16442 | 2252.55 |           7.16 |
|      47 | 2018-03-30 15:17:00+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM | 35525730.37 | 15463 | 2297.47 |           6.87 |
|      49 | 2018-03-30 17:17:00+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM | 28899674.51 | 12425 | 2325.93 |           5.59 |
|      50 | 2018-03-30 18:17:00+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM | 35606631.89 | 14848 | 2398.08 |           6.89 |
|      51 | 2018-03-30 19:17:00+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM | 36339708.61 | 16049 | 2264.30 |           7.03 |
|      52 | 2018-03-30 20:17:01+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM | 29955685.60 | 13621 | 2199.23 |           5.79 |
|      73 | 2018-03-31 18:17:00+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM |    22705.92 | 31999 |    0.71 |           0.00 |
|      77 | 2018-03-31 23:17:00+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM |    13896.36 | 17489 |    0.79 |           0.00 |
|      78 | 2018-04-01 00:17:00+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM |    21180.69 | 37533 |    0.56 |           0.00 |
|      89 | 2018-04-03 04:00:01+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM |    37948.14 | 30610 |    1.24 |           0.01 |
|     106 | 2018-04-03 21:00:01+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM |    17327.80 | 21625 |    0.80 |           0.00 |
|     107 | 2018-04-03 22:00:01+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM |    10792.78 | 13785 |    0.78 |           0.00 |
|     163 | 2018-04-06 06:00:01+00 | 2563607162 | 11a53a3f4909b816 | SELECT count(*) AS num_incidents FROM |     9255.47 | 16829 |    0.55 |           0.00 |
+---------+------------------------+------------+------------------+---------------------------------------+-------------+-------+---------+----------------+

So, when migrating from Oracle to Postgres, make sure to keep an eye on SQLs which perform Index Skip Scan in Oracle.

PostgreSQL – An introduction

2017 – A year of learning !!

Today technology is changing at a very rapid speed and it’s very difficult to keep up to it but as said, it’s never too late to start something new. This year I plan to spend some time outside of Oracle and learn some new things.

I have started learning PostgreSQL and I want to take this platform to share my learning and to learn from others.  As this is the start, it would very basic and hopefully with time, I will share more interesting things.

So, without wasting any more time, lets get started.

This blog is about PostgreSQL installation on MAC and a little more.

As HomeBrew is already running on my mac, I will use “Brew” to install PostgreSQL.

HomeBrew is a free and open-source software package management system that simplifies the installation of software on Apple’s macOS operating system.

You can read more about it on —

https://en.wikipedia.org/wiki/Homebrew_(package_management_software)

http://brew.sh/

Brew Installation is very simple.


bash-3.2$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
==> This script will install:
/usr/local/bin/brew
/usr/local/share/doc/homebrew
/usr/local/share/man/man1/brew.1
/usr/local/share/zsh/site-functions/_brew
/usr/local/etc/bash_completion.d/brew
/usr/local/Homebrew

Once installed, PostgreSQL installation is just one line command :-


bash-3.2$ brew --version
Homebrew 1.1.8
Homebrew/homebrew-core (git revision 3605; last commit 2017-01-23)
bash-3.2$
bash-3.2$
bash-3.2$ brew install postgres
==> Downloading https://homebrew.bintray.com/bottles/postgresql-9.6.1.el_capitan.bottle.tar.gz
######################################################################## 100.0%
==> Pouring postgresql-9.6.1.el_capitan.bottle.tar.gz
==> Using the sandbox
==> /usr/local/Cellar/postgresql/9.6.1/bin/initdb /usr/local/var/postgres
==> Caveats
If builds of PostgreSQL 9 are failing and you have version 8.x installed,
you may need to remove the previous version first. See:
https://github.com/Homebrew/homebrew/issues/2510

To migrate existing data from a previous major version (pre-9.0) of PostgreSQL, see:
https://www.postgresql.org/docs/9.6/static/upgrading.html

To migrate existing data from a previous minor version (9.0-9.5) of PostgreSQL, see:
https://www.postgresql.org/docs/9.6/static/pgupgrade.html

You will need your previous PostgreSQL installation from brew to perform `pg_upgrade`.
Do not run `brew cleanup postgresql` until you have performed the migration.

To have launchd start postgresql now and restart at login:
brew services start postgresql
Or, if you don't want/need a background service you can just run:
pg_ctl -D /usr/local/var/postgres start
==> Summary
🍺  /usr/local/Cellar/postgresql/9.6.1: 3,242 files, 36.4M
bash-3.2$

 Now, to start the PostgreSQL services

bash-3.2$ brew services start postgresql
==> Tapping homebrew/services
Cloning into '/usr/local/Homebrew/Library/Taps/homebrew/homebrew-services'...
remote: Counting objects: 10, done.
remote: Compressing objects: 100% (7/7), done.
remote: Total 10 (delta 0), reused 5 (delta 0), pack-reused 0
Unpacking objects: 100% (10/10), done.
Tapped 0 formulae (37 files, 50.7K)
==> Successfully started `postgresql` (label: homebrew.mxcl.postgresql)
bash-3.2$
bash-3.2$ ps -ef | grep postgres
1463626507 35179     1   0  7:38AM ??         0:00.03 /usr/local/opt/postgresql/bin/postgres -D /usr/local/var/postgres
1463626507 35182 35179   0  7:38AM ??         0:00.00 postgres: checkpointer process
1463626507 35183 35179   0  7:38AM ??         0:00.01 postgres: writer process
1463626507 35184 35179   0  7:38AM ??         0:00.00 postgres: wal writer process
1463626507 35185 35179   0  7:38AM ??         0:00.00 postgres: autovacuum launcher process
1463626507 35186 35179   0  7:38AM ??         0:00.00 postgres: stats collector process
1463626507 35191 26530   0  7:38AM ttys004    0:00.00 grep postgres
bash-3.2$
bash-3.2$ postgres --version
postgres (PostgreSQL) 9.6.1

Few things to note here :-

1. The postgresql services can be stoped/started either using “brew services” or pg_ctl utility. “Brew services” will only be avaliable in case you have installed homebrew.

2. /usr/local/var/postgres –> Is the data directory. The location can be different depending on the OS. It contains configuration file, pid detail file, tablespaces and other details. I will post more on this in another blog.

3. The PID of the postgres process can be found using the ps program, or from the file postmaster.pid in the data directory.

4. PostgreSQL postmaster is the master server process. postmaster is a deprecated alias of postgres.

bash-3.2$ cd /usr/local/var/postgres
bash-3.2$ cat postmaster.pid
59448   -->  PID 
/usr/local/var/postgres  --> Data Directory 
1485658006
5432  --> Port
/tmp --> Unix socket
localhost
  5432001   1310723

postgres=# \conninfo
You are connected to database "postgres" as user "prakanan" via socket in "/tmp" at port "5432".

5. After the service is up and running, you can connect to the database using psql, similar to sqlplus in Oracle.

bash-3.2$ psql
psql: FATAL:  database "anand" does not exist
bash-3.2$

By default, psql tries to connect to dbname with the username its installed.

6. PostgreSQL comes with 3 default databases
i. template0
ii. template1
iii. postgres

The template0 and template1 databases are known as template databases. The teplate1 database can be changed to allow to created a localized template for any new db to create. The template0 exists so so that when you alter template1, you still have a pristine copy to fall back on.

By default, the new database will be created by cloning the standard system database template1.

The template0 database is normally marked datallowconn = false to prevent its modification.

If you have worked on Oracle 12c multitenant db, then you can think of template0 as CDB and template1 as PDB.

7. Two tablespaces are automatically created when the database cluster is initialized. The pg_global tablespace is used for shared system catalogs. The pg_default tablespace is the default tablespace of the template1 and template0 databases

8. As Oracle has Alert log, PostgreSQL has kind of similar logfile. In MAC installation, the location is /usr/local/var/log.

You can look at other installation options at

https://www.postgresql.org/download/

In the next blog post, I will show how to connect to database and different kind of start/stop mode.