<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Nathan Robinson's blog]]></title><description><![CDATA[Nathan Robinson's blog]]></description><link>https://blog.robinhart.me</link><generator>RSS for Node</generator><lastBuildDate>Tue, 07 Apr 2026 21:32:20 GMT</lastBuildDate><atom:link href="https://blog.robinhart.me/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Introduction to PostgreSQL Aggregate Functions]]></title><description><![CDATA[Aggregation is a really helpful tool in the SQL arsenal, and it enables us to ask slightly more sophisticated questions of our data than we would be able to otherwise. This is a slightly more advanced topic, so be sure to brush up on basic SQL syntax...]]></description><link>https://blog.robinhart.me/introduction-to-postgresql-aggregate-functions</link><guid isPermaLink="true">https://blog.robinhart.me/introduction-to-postgresql-aggregate-functions</guid><category><![CDATA[PostgreSQL]]></category><category><![CDATA[aggregation]]></category><category><![CDATA[aggregate function]]></category><category><![CDATA[SQL]]></category><dc:creator><![CDATA[Nathan Robinson]]></dc:creator><pubDate>Sat, 11 May 2024 18:04:13 GMT</pubDate><content:encoded><![CDATA[<p>Aggregation is a really helpful tool in the SQL arsenal, and it enables us to ask slightly more sophisticated questions of our data than we would be able to otherwise. This is a slightly more advanced topic, so be sure to brush up on basic SQL syntax before jumping into this article.</p>
<p>We'll be using the following data model for the examples.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1715444790964/450749b9-8e8c-4d70-a4b2-7ab32383ac79.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-count">Count</h2>
<p>Here is a relatively simple example of when <code>COUNT</code> can be useful. We're calculating the number of expensive locations, which are those with a per-encounter cost of over $100. Here the <code>WHERE</code> clause is filtering out the cheap rows before our aggregation function operates against the resulting record set and produces a scalar value.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">select</span> <span class="hljs-keyword">count</span>(*)
    <span class="hljs-keyword">from</span> locations
    <span class="hljs-keyword">where</span> <span class="hljs-keyword">cost</span> &gt; <span class="hljs-number">100</span>
</code></pre>
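<p>To try the same idea without touching real tables, here is a minimal sketch that runs against inline sample rows. The <code>demo_locations</code> name and its three rows are made up for illustration and are not part of the data model above.</p>
<pre><code class="lang-sql">-- Hypothetical sample rows: only two of the three cost more than 100
select count(*)
    from (values ('Clinic A', 80),
                 ('Clinic B', 150),
                 ('Clinic C', 220)) as demo_locations(name, cost)
    where cost &gt; 100;
-- returns 2
</code></pre>
<p>The <code>WHERE</code> clause discards the 80 row first, so <code>COUNT</code> only ever sees the two remaining rows.</p>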
<h2 id="heading-group-by">Group by</h2>
<p>Group by allows us to slightly modify the behavior of our aggregation function in an interesting way. Consider the below example where we've chosen to group our encounters by patient ID and then apply the <code>COUNT</code> aggregation function. Because the aggregation is happening for each group, the output here is the number of encounters a given patient has had.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">select</span> patientid, <span class="hljs-keyword">count</span>(*)
    <span class="hljs-keyword">from</span> encounters
    <span class="hljs-keyword">group</span> <span class="hljs-keyword">by</span> patientid
<span class="hljs-keyword">order</span> <span class="hljs-keyword">by</span> patientid
</code></pre>
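<p>Here is a small sketch of the same grouping against inline sample rows, so the per-group counting is easy to follow. The <code>demo_encounters</code> rows are invented for illustration only.</p>
<pre><code class="lang-sql">-- Hypothetical encounters: patient 1 has two, patient 2 has one, patient 3 has three
select patientid, count(*)
    from (values (1), (1), (2), (3), (3), (3)) as demo_encounters(patientid)
    group by patientid
order by patientid;
-- patientid | count
--         1 |     2
--         2 |     1
--         3 |     3
</code></pre>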
<h2 id="heading-sum">Sum</h2>
<p>In this example, we are calculating the total revenue for each of our business's locations. First we join our locations and encounters tables to produce an intermediate record set that lists every encounter along with its associated cost. We then group these rows per location, and <code>SUM</code> adds up the cost values for each location.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">select</span> <span class="hljs-keyword">name</span>, <span class="hljs-keyword">sum</span>(<span class="hljs-keyword">cost</span>) <span class="hljs-keyword">as</span> revenue
    <span class="hljs-keyword">from</span> locations loc
    <span class="hljs-keyword">inner</span> <span class="hljs-keyword">join</span> encounters enc
        <span class="hljs-keyword">on</span> loc.id = enc.locid
    <span class="hljs-keyword">group</span> <span class="hljs-keyword">by</span> <span class="hljs-keyword">name</span>
<span class="hljs-keyword">order</span> <span class="hljs-keyword">by</span> revenue
</code></pre>
<h2 id="heading-extract">Extract</h2>
<p>If we want to calculate the total number of cancelled encounters per location in April 2024, we could do it this way.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">select</span> locid, <span class="hljs-keyword">count</span>(*)
    <span class="hljs-keyword">from</span> encounters
    <span class="hljs-keyword">where</span> starttime &gt;= <span class="hljs-string">'2024-04-01'</span>
            <span class="hljs-keyword">and</span> starttime &lt; <span class="hljs-string">'2024-05-01'</span>
            <span class="hljs-keyword">and</span> <span class="hljs-keyword">status</span> = <span class="hljs-string">'cancelled'</span>
    <span class="hljs-keyword">group</span> <span class="hljs-keyword">by</span> locid
<span class="hljs-keyword">order</span> <span class="hljs-keyword">by</span> <span class="hljs-keyword">count</span>(*) <span class="hljs-keyword">desc</span>
</code></pre>
<p>However, we could also make use of the <code>EXTRACT</code> function instead. The <code>EXTRACT</code> function lets us pull out pieces of PostgreSQL timestamps.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">select</span> locid, <span class="hljs-keyword">count</span>(*)
    <span class="hljs-keyword">from</span> encounters
    <span class="hljs-keyword">where</span> <span class="hljs-keyword">extract</span>(<span class="hljs-keyword">month</span> <span class="hljs-keyword">from</span> starttime) = <span class="hljs-number">4</span>
            <span class="hljs-keyword">and</span> <span class="hljs-keyword">extract</span>(<span class="hljs-keyword">year</span> <span class="hljs-keyword">from</span> starttime) = <span class="hljs-number">2024</span>
            <span class="hljs-keyword">and</span> <span class="hljs-keyword">status</span> = <span class="hljs-string">'cancelled'</span>
    <span class="hljs-keyword">group</span> <span class="hljs-keyword">by</span> locid
<span class="hljs-keyword">order</span> <span class="hljs-keyword">by</span> <span class="hljs-keyword">count</span>(*) <span class="hljs-keyword">desc</span>
</code></pre>
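<p>To see what <code>EXTRACT</code> returns on its own, here is a quick sketch against a single hard-coded timestamp. The timestamp value and the <code>q_</code> column aliases are just illustrative.</p>
<pre><code class="lang-sql">-- Pull individual fields out of one sample timestamp
select extract(year from ts) as q_year,
       extract(month from ts) as q_month,
       extract(day from ts) as q_day
    from (values (timestamp '2024-04-15 10:30:00')) as demo(ts);
-- q_year | q_month | q_day
--   2024 |       4 |    15
</code></pre>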
<p>Here is a slightly more sophisticated usage of extract, where we seek to calculate total encounters per location per month in 2024.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">select</span> locid, <span class="hljs-keyword">extract</span>(<span class="hljs-keyword">month</span> <span class="hljs-keyword">from</span> starttime) <span class="hljs-keyword">as</span> q_month, <span class="hljs-keyword">count</span>(*)
    <span class="hljs-keyword">from</span> encounters
    <span class="hljs-keyword">where</span> <span class="hljs-keyword">extract</span>(<span class="hljs-keyword">year</span> <span class="hljs-keyword">from</span> starttime) = <span class="hljs-number">2024</span>
    <span class="hljs-keyword">group</span> <span class="hljs-keyword">by</span> locid, q_month
<span class="hljs-keyword">order</span> <span class="hljs-keyword">by</span> locid, q_month
</code></pre>
<h2 id="heading-having">Having</h2>
<p>The <code>HAVING</code> clause differs from <code>WHERE</code> because it is applied after the grouping has already occurred. In other words, <code>HAVING</code> filters the output of the aggregation function, while <code>WHERE</code> filters what goes into the aggregation function. This example shows how we would output only locations that have more than 100 total encounters.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">select</span> locid, <span class="hljs-keyword">count</span>(*) <span class="hljs-keyword">as</span> <span class="hljs-string">"Total Encounters"</span>
    <span class="hljs-keyword">from</span> encounters
    <span class="hljs-keyword">group</span> <span class="hljs-keyword">by</span> locid
    <span class="hljs-keyword">having</span> <span class="hljs-keyword">count</span>(*) &gt; <span class="hljs-number">100</span>
<span class="hljs-keyword">order</span> <span class="hljs-keyword">by</span> locid
</code></pre>
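<p>The two clauses can also appear in the same query, which makes the before and after distinction easier to see. This is only a sketch against the same <code>encounters</code> table, and the threshold of 10 is an arbitrary number chosen for illustration.</p>
<pre><code class="lang-sql">select locid, count(*) as cancelled
    from encounters
    where status = 'cancelled'   -- row filter: applied before grouping
    group by locid
    having count(*) &gt; 10         -- group filter: applied after aggregation
order by locid
</code></pre>
<p>Here <code>WHERE</code> decides which rows are counted at all, and <code>HAVING</code> then decides which of the resulting groups are shown.</p>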
<p>Here is another example where we list locations that have less than $1000 of revenue.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">select</span> <span class="hljs-keyword">name</span>, <span class="hljs-keyword">sum</span>(<span class="hljs-keyword">cost</span>) <span class="hljs-keyword">as</span> revenue
    <span class="hljs-keyword">from</span> locations loc
    <span class="hljs-keyword">inner</span> <span class="hljs-keyword">join</span> encounters enc
        <span class="hljs-keyword">on</span> loc.id = enc.locid
    <span class="hljs-keyword">group</span> <span class="hljs-keyword">by</span> loc.name
    <span class="hljs-keyword">having</span> <span class="hljs-keyword">sum</span>(<span class="hljs-keyword">cost</span>) &lt; <span class="hljs-number">1000</span>
<span class="hljs-keyword">order</span> <span class="hljs-keyword">by</span> revenue
</code></pre>
<p>Just for science, here is an alternative query that uses a subquery instead of <code>HAVING</code>. Note that in the subquery version we only have to write our aggregation, <code>sum(cost)</code>, once, while in the <code>HAVING</code> query we have to repeat it in both the <code>SELECT</code> list and the <code>HAVING</code> clause. Unfortunately, this is because, unlike some other flavors of SQL, PostgreSQL does not support column aliases in the <code>HAVING</code> clause. This means that modifying our previous example to use <code>HAVING revenue &lt; 1000</code> results in an error.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">select</span> <span class="hljs-keyword">name</span>, revenue
    <span class="hljs-keyword">from</span> (<span class="hljs-keyword">select</span> <span class="hljs-keyword">name</span>, <span class="hljs-keyword">sum</span>(<span class="hljs-keyword">cost</span>) <span class="hljs-keyword">as</span> revenue
            <span class="hljs-keyword">from</span> locations loc
            <span class="hljs-keyword">inner</span> <span class="hljs-keyword">join</span> encounters enc
                <span class="hljs-keyword">on</span> loc.id = enc.locid
            <span class="hljs-keyword">group</span> <span class="hljs-keyword">by</span> loc.name
        ) <span class="hljs-keyword">as</span> sub
    <span class="hljs-keyword">where</span> sub.revenue &lt; <span class="hljs-number">1000</span>
<span class="hljs-keyword">order</span> <span class="hljs-keyword">by</span> revenue
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this post we covered aggregation functions like <code>COUNT</code> and <code>SUM</code> alongside grouping tools like <code>GROUP BY</code> and <code>HAVING</code>. We also covered <code>EXTRACT</code>, which makes aggregating data based on dates more manageable. Combined, these tools allow us to ask more detailed questions about our data and retrieve grouped and aggregated metrics from our tables.</p>
]]></content:encoded></item><item><title><![CDATA[Beginners guide to modifying data in PostgreSQL]]></title><description><![CDATA[There are three primary modification operations we can perform in SQL, create, update, and delete. In this blog post, we'll cover examples and explanations of each.
We'll be using the following data model for the examples.

Inserting Data
Here's an e...]]></description><link>https://blog.robinhart.me/postgresql-modifying-data-for-beginners</link><guid isPermaLink="true">https://blog.robinhart.me/postgresql-modifying-data-for-beginners</guid><category><![CDATA[PostgreSQL]]></category><category><![CDATA[SQL]]></category><category><![CDATA[Beginner Developers]]></category><dc:creator><![CDATA[Nathan Robinson]]></dc:creator><pubDate>Sat, 27 Apr 2024 16:14:17 GMT</pubDate><content:encoded><![CDATA[<p>There are three primary modification operations we can perform in SQL: create, update, and delete. In this blog post, we'll cover examples and explanations of each.</p>
<p>We'll be using the following data model for the examples.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714231206690/2d60acf2-c6ca-4e5e-97ce-53eba144eca4.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-inserting-data">Inserting Data</h2>
<p>Here's an example of inserting two new rows into our patients table.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">insert</span> <span class="hljs-keyword">into</span> patients (<span class="hljs-keyword">id</span>, firstname, lastname, phone)
<span class="hljs-keyword">values</span> (<span class="hljs-number">1</span>, <span class="hljs-string">'Terry'</span>, <span class="hljs-string">'Avila'</span>, <span class="hljs-string">'540-923-0088'</span>),
       (<span class="hljs-number">2</span>, <span class="hljs-string">'Dorothy'</span>, <span class="hljs-string">'Buckler'</span>, <span class="hljs-string">'334-422-6232'</span>)
</code></pre>
<p>Technically, the column list is optional here since I'm providing values for all columns in table order (example below), but I recommend including it to future-proof your inserts.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">insert</span> <span class="hljs-keyword">into</span> patients
<span class="hljs-keyword">values</span> (<span class="hljs-number">1</span>, <span class="hljs-string">'Terry'</span>, <span class="hljs-string">'Avila'</span>, <span class="hljs-string">'540-923-0088'</span>),
       (<span class="hljs-number">2</span>, <span class="hljs-string">'Dorothy'</span>, <span class="hljs-string">'Buckler'</span>, <span class="hljs-string">'334-422-6232'</span>)
</code></pre>
<p>One final note is that PostgreSQL has a column pseudo-type called <code>SERIAL</code> that we are not taking advantage of here. It automatically generates an incrementing integer from a sequence for each new row. I recommend using it when possible on your primary key field to simplify your inserts.</p>
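<p>As a rough sketch of what that could look like, here is a hypothetical <code>patients_demo</code> table. The table name and the <code>text</code> column types are assumptions, since the real definitions live in the data model image.</p>
<pre><code class="lang-sql">-- Hypothetical table: id is filled in automatically from a sequence
create table patients_demo (
    id serial primary key,
    firstname text,
    lastname text,
    phone text
);

-- No id in the column list, so PostgreSQL generates one per row
insert into patients_demo (firstname, lastname, phone)
values ('Terry', 'Avila', '540-923-0088'),
       ('Dorothy', 'Buckler', '334-422-6232')
returning id;
</code></pre>
<p>The <code>RETURNING</code> clause hands back the generated ids, which is handy when the application needs them right after the insert.</p>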
<h2 id="heading-updating-data">Updating Data</h2>
<p>Below is an example of updating a single row's values for last name and phone number. Be very cautious when modifying existing data in SQL, as simply forgetting the <code>WHERE</code> clause will result in every row in the table being modified. Performing routine backups of your database is essential, and you should always try your updates in a safe, isolated environment before making changes on a live system.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">update</span> patients
<span class="hljs-keyword">set</span> lastname = <span class="hljs-string">'Bucklar'</span>,
    phone = <span class="hljs-string">'334-123-9372'</span>
<span class="hljs-keyword">where</span> <span class="hljs-keyword">id</span> = <span class="hljs-number">2</span>;
</code></pre>
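<p>One way to add a safety net, sketched here under the assumption that you are running statements interactively (for example in <code>psql</code>), is to wrap the update in a transaction and inspect the changed rows before deciding whether to keep them.</p>
<pre><code class="lang-sql">begin;

update patients
set lastname = 'Bucklar',
    phone = '334-123-9372'
where id = 2
returning *;   -- shows exactly which rows were changed

-- If the output looks right, run: commit;
-- Otherwise undo everything done since begin:
rollback;
</code></pre>
<p>This does not replace backups, but it does make a forgotten <code>WHERE</code> clause much easier to catch before it becomes permanent.</p>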
<p>Here is a contrived example where we wish to set the last name of one patient equal to the last name of a different patient. Notice how we're performing a scalar subquery here to retrieve the new value we wish to apply.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">update</span> patients pats
<span class="hljs-keyword">set</span> lastname = (
    <span class="hljs-keyword">select</span> lastname <span class="hljs-keyword">from</span> patients <span class="hljs-keyword">where</span> <span class="hljs-keyword">id</span> = <span class="hljs-number">0</span>
)
<span class="hljs-keyword">where</span> pats.id = <span class="hljs-number">1</span>;
</code></pre>
<p>PostgreSQL also has a bonus feature not present in standard SQL, a <code>FROM</code> clause on <code>UPDATE</code>, that allows you to perform the same query in a more readable way.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">update</span> patients pats1
<span class="hljs-keyword">set</span> lastname = pats2.lastname
<span class="hljs-keyword">from</span> (<span class="hljs-keyword">select</span> * <span class="hljs-keyword">from</span> patients <span class="hljs-keyword">where</span> <span class="hljs-keyword">id</span> = <span class="hljs-number">0</span>) pats2
<span class="hljs-keyword">where</span> pats1.id = <span class="hljs-number">1</span>;
</code></pre>
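<p>As a variation, the same update can probably be written without the inner <code>SELECT</code> by listing the table a second time in <code>FROM</code> and moving the filter into the <code>WHERE</code> clause. This is a sketch that reuses the ids from the example above, with <code>pats1</code> and <code>pats2</code> as arbitrary aliases.</p>
<pre><code class="lang-sql">update patients pats1
set lastname = pats2.lastname
from patients pats2        -- second reference to the same table
where pats1.id = 1         -- the row being updated
  and pats2.id = 0;        -- the row supplying the new value
</code></pre>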
<h2 id="heading-deleting-data">Deleting Data</h2>
<p>Let's consider a scenario where we want to clear out some of our unnecessary data by deleting all patients who have no encounter associated with them.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">delete</span> <span class="hljs-keyword">from</span> patients
<span class="hljs-keyword">where</span> <span class="hljs-keyword">id</span> <span class="hljs-keyword">not</span> <span class="hljs-keyword">in</span> (
    <span class="hljs-keyword">select</span> patientid
        <span class="hljs-keyword">from</span> encounters
)
</code></pre>
<p>This performs a subquery on encounters, pulling out the patient ID recorded for each encounter. We then check whether each patient's id is in that result set, and if it isn't, we delete that patient.</p>
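<p>One caveat worth checking is how <code>NOT IN</code> behaves when the subquery returns a <code>NULL</code>. The sketch below uses inline sample rows and assumes <code>patientid</code> could be nullable, which may not be true of the actual data model.</p>
<pre><code class="lang-sql">-- Hypothetical data: patients 2 and 3 have no encounters,
-- but one encounter row has a null patientid
select id
    from (values (1), (2), (3)) as demo_patients(id)
    where id not in (
        select patientid
            from (values (1), (null)) as demo_encounters(patientid)
    );
-- returns no rows
</code></pre>
<p>Comparing a value to <code>NULL</code> yields unknown rather than true, so no id ever passes the <code>NOT IN</code> test. If the column can contain nulls, adding <code>where patientid is not null</code> to the subquery avoids the surprise.</p>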
<p>Here is an alternative way to perform this query, which uses a correlated subquery.</p>
<pre><code class="lang-sql"><span class="hljs-keyword">delete</span> <span class="hljs-keyword">from</span> patients
<span class="hljs-keyword">where</span> <span class="hljs-keyword">not</span> <span class="hljs-keyword">exists</span> (
    <span class="hljs-keyword">select</span> <span class="hljs-number">1</span>
        <span class="hljs-keyword">from</span> encounters
    <span class="hljs-keyword">where</span> encounters.patientid = patients.id
)
</code></pre>
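<p>These two queries can have different performance characteristics, but it's worth noting that the query optimizer may rewrite either one to improve performance. Correlated subqueries can be hard to optimize in general, so it is worth checking the plan PostgreSQL actually chooses with <code>EXPLAIN</code> rather than assuming one form is faster.</p>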
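<p>A low-risk way to compare the two forms is to ask PostgreSQL for its plan. Plain <code>EXPLAIN</code> only plans the statement and does not run it, so nothing is deleted. The sketch below uses the correlated version from above, and the same prefix works for the <code>NOT IN</code> version.</p>
<pre><code class="lang-sql">-- Shows the chosen plan without executing the delete
explain
delete from patients
where not exists (
    select 1
        from encounters
    where encounters.patientid = patients.id
);
</code></pre>
<p>Be careful with <code>EXPLAIN ANALYZE</code>, which does execute the statement. If you need real timings, wrap it in a transaction and roll back afterwards.</p>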
]]></content:encoded></item></channel></rss>