Beginners guide to modifying data in PostgreSQL

There are three primary modification operations we can perform in SQL, create, update, and delete. In this blog post, we'll cover examples and explanations of each.

We'll be using the following data model for the examples.

Inserting Data

Here's an example of inserting two new rows into our patients table.

insert into patients (id, firstname, lastname, phone)
values (1, 'Terry', 'Avila', '540-923-0088'),
       (2, 'Dorothy', 'Buckler', '334-422-6232')

Technically, the column specification is optional here since I'm providing values for all columns (example below), but I recommend including them for future proofing your inserts.

insert into patients
values (1, 'Terry', 'Avila', '540-923-0088'),
       (2, 'Dorothy', 'Buckler', '334-422-6232')

One final note is that PostgreSQL does have a column level setting that we are not taking advantage of here called SERIAL. It automatically generates a unique, incrementing integer for each new row. I recommend using this when possible on your primary key field to simplify your inserts.

Updating Data

Below is an example of updating a single row's values for last name and phone number. Be very cautious when performing modifications of existing data in SQL, as simply forgetting the where clause will result in the entire table being modified. Performing routine backups of your database is essential, and you should always try your updates in a safe isolated environment before performing changes on a live system.

update patients
set lastname = 'Bucklar',
    phone = '334-123-9372'
where id = 2;

Here is a contrived example where we wish to set the last name of one patient equal to the last name of a different patient. Notice how we're performing a scalar subquery here to retrieve the new value we wish to apply.

update patients pats
set lastname = (
    select lastname from patients where id = 0
)
where pats.id = 1;

PostgreSQL also has a bonus feature not present in standard SQL that allows you to perform the same query in a more readable way.

update patients pats1
set lastname = pats2.lastname
from (select * from patients where id = 0) pats2
where pats1.id = 1;

Deleting Data

Let's consider a scenario where we want to clear our some of our unnecessary data by deleting all patients who have no encounter associated with them.

delete from patients
where id not in (
    select patientid
        from encounters
)

This performs a subquery on encounters, pulling out the patientid's on record for each encounter. We then check to see if a patient id is in that result set. If it isn't then we delete it.

Here is an alternative way to perform this query, which uses a correlated subquery.

delete from patients
where not exists (
    select 1
        from encounters
    where encounters.patientid = patients.id
)

These two queries have different performance characteristics, but it's worth noting that the query optimizer may choose to modify the behavior to benefit performance. In general, correlated subqueries are hard to optimize.