Financial plan on PostgreSQL
This post shows how to build a financial plan on pure SQL using PostgreSQL database.
Along the way, the post teaches useful PostgreSQL-specific and common SQL concepts. Learn how pure SQL allows getting away from overengineering.
Tasking
A financial plan is an easy but, same time, comprehensive concept. Definition of this thing merits a separate post or even a book. But this post, for simplicity and the purpose of the tech aspects demonstrating, doesn’t go too far. Let’s agree on the following definition. A financial plan is a concept that builds an cumulative profit based on coming in and going out cash flows.
The system takes an income (that can be a monthly salary, paid dividends, etc.) and expenses (taxes, utility bills, expenses for food, etc.). Then, using that data, it calculates profit for each month during a year and accumulates to the total. It results in a table that’s understandable for everyone. Even though it’s very primitive, it’s still useful as it tells when you can afford the next video game, a car, or even the house of your dream.
Setting up
Create PostgreSQL DB and log into its console:
$ createdb finance_plan
$ psql -d finance_plan
All the following commands and their outputs run inside the DB console. Also, the post omits details related to multi-tenancy and performance issues for simplicity of the article aspects demonstration.
Create incomes
and expenses
tables where the corresponding monthly and quarterly amounts are defined. The type
column signifies whether it’s monthly or quarterly.
create table incomes (amount int, type varchar);
create table expenses (amount int, type varchar);
These are all tables our financial plan needs.
Add some data to the tables that will be used by the code that generates the plan:
insert into incomes (amount, type) values (2000, 'monthly');
insert into incomes (amount, type) values (500, 'quarterly');
insert into expenses (amount, type) values (500, 'monthly');
insert into expenses (amount, type) values (1000, 'quarterly');
Literally, that defines monthly income $2,000 (assume it’s our salary) and quarterly income $500 (that can be some bonuses). And there are two expenses: $500 monthly, e.g. utility bills and $1,000 quarterly, e.g. a payment for education.
The plan as a table
Generating the plan can be done within one SQL query that generates a table with columns: Month, Income, Expenses, Profit, Cumulative Profit. A row represents calculations for a specific month. There are as many rows as months - 12.
To generate 12 rows that represent a month number we can use a series:
select * from generate_series(1, 12) as month;
month
═══════
1
2
3
4
5
6
7
8
9
10
11
12
(12 rows)
The ending as month
is an alias that gives month
name to the column name.
Without aliasing, PostgreSQL generates the name automatically.
For this select the autogenerated name is generate_series
(check it running this query - select * from generate_series(1, 12)
).
This name is not very clear and confusing. The alias makes it human-friendly.
We need these values later inside another query. Writing this piece of SQL as a sub-query would end with a complex query. To prevent that, create a short-cut, kinda virtual table, that behaves like a real one when selecting data. PostgreSQL has a special concept for that - views.
create view months as (select * from generate_series(1, 12) as month);
This is how it can be used by select
:
select month from months;
month
═══════
1
2
3
4
5
6
7
8
9
10
11
12
(12 rows)
Moving forward, let’s select the sum of monthly expenses and incomes:
select month, sum(monthly_expenses.amount) as monthly_expenses
from months
join expenses as monthly_expenses on monthly_expenses.type = 'monthly'
group by month
order by month;
month │ monthly_expenses
═══════╪══════════════════
1 │ 500
2 │ 500
3 │ 500
4 │ 500
5 │ 500
6 │ 500
7 │ 500
8 │ 500
9 │ 500
10 │ 500
11 │ 500
12 │ 500
(12 rows)
join expenses as monthly_expenses on monthly_expenses.type = 'monthly'
means inner joining only monthly expenses and register them with name monthly_expenses
.
Later on, that allows us to use this logical and clear name in the query.
We could also write inner join
instead of join
. But we omit it as a redundant thing that makes the construction verbose.
There can be many expenses in one month and we should instruct PostgreSQL on what to do with that situation.
In this case, we summarize them. The combination group by month
with the aggregate sum
function does the trick.
order by month
tells to sort rows by month. If not specify that, the rows will be listed in a chaotic order messing up the financial plan.
The as
keyword for aliases is optional in all the places above. And the query can be written like this:
select month, sum(monthly_expenses.amount) monthly_expenses
from months
join expenses monthly_expenses on monthly_expenses.type = 'monthly'
group by month
order by month;
The query definitely got a bit shorter. But agree or not, it became more confusing. That’s why we use it all over the post. Though, it’s completely ok to omit it in production.
Using the same idea with joining, we select quarterly expenses:
select month, sum(monthly_expenses.amount) as monthly_expenses, sum(quarterly_expenses.amount) as quarterly_expenses
from months
join expenses as monthly_expenses on monthly_expenses.type = 'monthly'
left join expenses as quarterly_expenses on quarterly_expenses.type = 'quarterly' and month % 3 = 0
group by month
order by month;
month │ monthly_expenses │ quarterly_expenses
═══════╪══════════════════╪════════════════════
1 │ 500 │ ¤
2 │ 500 │ ¤
3 │ 500 │ 1000
4 │ 500 │ ¤
5 │ 500 │ ¤
6 │ 500 │ 1000
7 │ 500 │ ¤
8 │ 500 │ ¤
9 │ 500 │ 1000
10 │ 500 │ ¤
11 │ 500 │ ¤
12 │ 500 │ 1000
(12 rows)
Note, how quarterly_expenses
are joined now.
The join uses an additional condition month % 3 = 0
to select only 4 months with quarterly expenses: 3, 6, 9, 12.
All of these numbers are multiplied by 3.
In case there are no expenses we don’t want the rows crossed out by the select, so it uses left join
instead of inner
one.
To better feel left join need check out how the result looks like using inner join:
select month, sum(monthly_expenses.amount) as monthly_expenses, sum(quarterly_expenses.amount) as quarterly_expenses
from months
join expenses as monthly_expenses on monthly_expenses.type = 'monthly'
join expenses as quarterly_expenses on quarterly_expenses.type = 'quarterly' and month % 3 = 0
group by month
order by month;
month │ monthly_expenses │ quarterly_expenses
═══════╪══════════════════╪════════════════════
3 │ 500 │ 1000
6 │ 500 │ 1000
9 │ 500 │ 1000
12 │ 500 │ 1000
(4 rows)
That’s not exactly what we want, agreed?
It’s time to summarize all expenses (monthly + quarterly) for each month:
select
month,
sum(monthly_expenses.amount) as monthly_expenses,
sum(quarterly_expenses.amount) as quarterly_expenses,
sum(monthly_expenses.amount) + sum(quarterly_expenses.amount) as expenses
from months
join expenses as monthly_expenses on monthly_expenses.type = 'monthly'
left join expenses as quarterly_expenses on quarterly_expenses.type = 'quarterly' and month % 3 = 0
group by month
order by month;
month │ monthly_expenses │ quarterly_expenses │ expenses
═══════╪══════════════════╪════════════════════╪══════════
1 │ 500 │ ¤ │ ¤
2 │ 500 │ ¤ │ ¤
3 │ 500 │ 1000 │ 1500
4 │ 500 │ ¤ │ ¤
5 │ 500 │ ¤ │ ¤
6 │ 500 │ 1000 │ 1500
7 │ 500 │ ¤ │ ¤
8 │ 500 │ ¤ │ ¤
9 │ 500 │ 1000 │ 1500
10 │ 500 │ ¤ │ ¤
11 │ 500 │ ¤ │ ¤
12 │ 500 │ 1000 │ 1500
(12 rows)
Note, the ¤
symbol in the table. That’s not what we want. Only the months without quarterly expenses have these weird symbols.
They mean null
(this is how my PostgreSQL console is configured, it’s possible to configure to show null
in some other way).
sum(quarterly_expenses.amount)
gives null
on the rows with empty joined quarterly expenses and if sum it up with some other number it results in null
as well.
But we don’t need that, this null
should be 0
in fact. Use coalesce
for that:
select
month,
sum(monthly_expenses.amount) as monthly_expenses,
coalesce(sum(quarterly_expenses.amount), 0) as quarterly_expenses,
sum(monthly_expenses.amount) + coalesce(sum(quarterly_expenses.amount), 0) as expenses
from months
join expenses as monthly_expenses on monthly_expenses.type = 'monthly'
left join expenses as quarterly_expenses on quarterly_expenses.type = 'quarterly' and month % 3 = 0
group by month
order by month;
month │ monthly_expenses │ quarterly_expenses │ expenses
═══════╪══════════════════╪════════════════════╪══════════
1 │ 500 │ 0 │ 500
2 │ 500 │ 0 │ 500
3 │ 500 │ 1000 │ 1500
4 │ 500 │ 0 │ 500
5 │ 500 │ 0 │ 500
6 │ 500 │ 1000 │ 1500
7 │ 500 │ 0 │ 500
8 │ 500 │ 0 │ 500
9 │ 500 │ 1000 │ 1500
10 │ 500 │ 0 │ 500
11 │ 500 │ 0 │ 500
12 │ 500 │ 1000 │ 1500
(12 rows)
Here we go! Now it has correctly calculated sum of expenses for each month.
There is one annoying thing here, though. Notice the repeated construction above coalesce(sum(quarterly_expenses.amount), 0)
.
PostgreSQL doesn’t allow us to use the quarterly_expenses
column alias in the subsequent column. In other words, this query is not valid:
select
month,
sum(monthly_expenses.amount) as monthly_expenses,
coalesce(sum(quarterly_expenses.amount), 0) as quarterly_expense,
sum(monthly_expenses.amount) + quarterly_expense as expenses
from months
join expenses as monthly_expenses on monthly_expenses.type = 'monthly'
left join expenses as quarterly_expenses on quarterly_expenses.type = 'quarterly' and month % 3 = 0
group by month
order by month;
Here we use quaterly_expense
alias for the column so that the name is different from the joined monthly_expenses
collection. If run this query it gives the following error:
ERROR: column "quarterly_expense" does not exist
LINE 5: sum(monthly_expenses.amount) + quarterly_expense as expe...
To overcome that we can use lateral join
, that allows to aggregate the joined data once and define an alias for it that can be reused in the select
statement:
select
month,
sum(monthly_expenses.amount) as monthly_expenses,
quarterly_expenses.amount as quarterly_expenses,
sum(monthly_expenses.amount) + quarterly_expenses.amount as expenses
from months
join expenses as monthly_expenses on monthly_expenses.type = 'monthly'
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'quarterly' and month % 3 = 0) as quarterly_expenses on true
group by month, quarterly_expenses.amount
order by month;
In short, it aggregates the sub-select into one row with one column amount
that’s easy to reuse within the main query.
Note, the on true
at the end of the lateral join
.
It specifies a condition on when to join the data. We want to join it always. That’s why it’s on true
.
Even if there are no quarterly expenses we need that 0, to sum up the monthly expense amount.
Also, PostgreSQL forces us to add quarterly_expenses.amount
into grouping as theoretically there can be many rows.
Of course, we know that it’s not true because there will be always one row.
But PostgreSQL is not aware of that. There is no other way to instruct it than to change the grouping.
Make the joined monthly expenses to follow the DRY (don’t repeat yourself) idea as well:
select
month,
monthly_expenses.amount as monthly_expenses,
quarterly_expenses.amount as quarterly_expenses,
monthly_expenses.amount + quarterly_expenses.amount as expenses
from months
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'monthly') as monthly_expenses on true
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'quarterly' and month % 3 = 0) as quarterly_expenses on true
group by month, quarterly_expenses.amount, monthly_expenses.amount
order by month;
Now we use coalesce
for monthly expenses too. There might be no monthly expenses. The query becomes universal this way.
Same way join incomes:
select
month,
monthly_expenses.amount as monthly_expenses,
quarterly_expenses.amount as quarterly_expenses,
monthly_expenses.amount + quarterly_expenses.amount as expenses,
monthly_incomes.amount as monthly_incomes,
quarterly_incomes.amount as quarterly_incomes,
monthly_incomes.amount + quarterly_incomes.amount as incomes
from months
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'monthly') as monthly_expenses on true
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'quarterly' and month % 3 = 0) as quarterly_expenses on true
join lateral (select coalesce(sum(amount), 0) as amount from incomes where type = 'monthly') as monthly_incomes on true
join lateral (select coalesce(sum(amount), 0) as amount from incomes where type = 'quarterly' and month % 3 = 0) as quarterly_incomes on true
group by month, monthly_expenses.amount, quarterly_expenses.amount, monthly_incomes.amount, quarterly_incomes.amount
order by month;
month │ monthly_expenses │ quarterly_expenses │ expenses │ monthly_incomes │ quarterly_incomes │ incomes
═══════╪══════════════════╪════════════════════╪══════════╪═════════════════╪═══════════════════╪═════════
1 │ 500 │ 0 │ 500 │ 2000 │ 0 │ 2000
2 │ 500 │ 0 │ 500 │ 2000 │ 0 │ 2000
3 │ 500 │ 1000 │ 1500 │ 2000 │ 500 │ 2500
4 │ 500 │ 0 │ 500 │ 2000 │ 0 │ 2000
5 │ 500 │ 0 │ 500 │ 2000 │ 0 │ 2000
6 │ 500 │ 1000 │ 1500 │ 2000 │ 500 │ 2500
7 │ 500 │ 0 │ 500 │ 2000 │ 0 │ 2000
8 │ 500 │ 0 │ 500 │ 2000 │ 0 │ 2000
9 │ 500 │ 1000 │ 1500 │ 2000 │ 500 │ 2500
10 │ 500 │ 0 │ 500 │ 2000 │ 0 │ 2000
11 │ 500 │ 0 │ 500 │ 2000 │ 0 │ 2000
12 │ 500 │ 1000 │ 1500 │ 2000 │ 500 │ 2500
(12 rows)
We are too close to our expected resulting table. Remember, the desired columns of the table are Month, Income, Expenses, Profit, Cumulative Profit. Let’s remove those temp columns monthly_expenses, quarterly_expenses, monthly_incomes, quarterly_incomes. They are good for details and for debugging. But we are sure it has no mistakes now. So we simplify it:
select
month,
monthly_expenses.amount + quarterly_expenses.amount as expenses,
monthly_incomes.amount + quarterly_incomes.amount as incomes
from months
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'monthly') as monthly_expenses on true
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'quarterly' and month % 3 = 0) as quarterly_expenses on true
join lateral (select coalesce(sum(amount), 0) as amount from incomes where type = 'monthly') as monthly_incomes on true
join lateral (select coalesce(sum(amount), 0) as amount from incomes where type = 'quarterly' and month % 3 = 0) as quarterly_incomes on true
group by month, monthly_expenses.amount, quarterly_expenses.amount, monthly_incomes.amount, quarterly_incomes.amount
order by month;
month │ expenses │ incomes
═══════╪══════════╪═════════
1 │ 500 │ 2000
2 │ 500 │ 2000
3 │ 1500 │ 2500
4 │ 500 │ 2000
5 │ 500 │ 2000
6 │ 1500 │ 2500
7 │ 500 │ 2000
8 │ 500 │ 2000
9 │ 1500 │ 2500
10 │ 500 │ 2000
11 │ 500 │ 2000
12 │ 1500 │ 2500
(12 rows)
It’s easy to add profit column that’s incomes - expenses
:
select
month,
monthly_expenses.amount + quarterly_expenses.amount as expenses,
monthly_incomes.amount + quarterly_incomes.amount as incomes,
(monthly_incomes.amount + quarterly_incomes.amount) - (monthly_expenses.amount + quarterly_expenses.amount) as profit
from months
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'monthly') as monthly_expenses on true
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'quarterly' and month % 3 = 0) as quarterly_expenses on true
join lateral (select coalesce(sum(amount), 0) as amount from incomes where type = 'monthly') as monthly_incomes on true
join lateral (select coalesce(sum(amount), 0) as amount from incomes where type = 'quarterly' and month % 3 = 0) as quarterly_incomes on true
group by month, monthly_expenses.amount, quarterly_expenses.amount, monthly_incomes.amount, quarterly_incomes.amount
order by month;
month │ expenses │ incomes │ profit
═══════╪══════════╪═════════╪════════
1 │ 500 │ 2000 │ 1500
2 │ 500 │ 2000 │ 1500
3 │ 1500 │ 2500 │ 1000
4 │ 500 │ 2000 │ 1500
5 │ 500 │ 2000 │ 1500
6 │ 1500 │ 2500 │ 1000
7 │ 500 │ 2000 │ 1500
8 │ 500 │ 2000 │ 1500
9 │ 1500 │ 2500 │ 1000
10 │ 500 │ 2000 │ 1500
11 │ 500 │ 2000 │ 1500
12 │ 1500 │ 2500 │ 1000
(12 rows)
To get the cumulative profit column that summarizes the current row profit with the previous one we should use a window function.
That’s another concept of PostgreSQL that can capture the current row and apply an aggregate function on the other rows of the selecting result.
In our case, we want to summarize the current row profit with the previous one. The piece of SQL is pretty easy: sum(profit) over (order by month)
.
But profit
is a dynamic value calculated on the fly using another sum
function. PostgreSQL doesn’t allow to nest them.
To unblock this constraint, we create another view from the current result:
create view monthly_profits as (
select
month,
monthly_expenses.amount + quarterly_expenses.amount as expenses,
monthly_incomes.amount + quarterly_incomes.amount as incomes,
(monthly_incomes.amount + quarterly_incomes.amount) - (monthly_expenses.amount + quarterly_expenses.amount) as profit
from months
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'monthly') as monthly_expenses on true
join lateral (select coalesce(sum(amount), 0) as amount from expenses where type = 'quarterly' and month % 3 = 0) as quarterly_expenses on true
join lateral (select coalesce(sum(amount), 0) as amount from incomes where type = 'monthly') as monthly_incomes on true
join lateral (select coalesce(sum(amount), 0) as amount from incomes where type = 'quarterly' and month % 3 = 0) as quarterly_incomes on true
group by month, monthly_expenses.amount, quarterly_expenses.amount, monthly_incomes.amount, quarterly_incomes.amount
order by month
);
Now, we can easily select the result:
select * from monthly_profits;
month │ expenses │ incomes │ profit
═══════╪══════════╪═════════╪════════
1 │ 500 │ 2000 │ 1500
2 │ 500 │ 2000 │ 1500
3 │ 1500 │ 2500 │ 1000
4 │ 500 │ 2000 │ 1500
5 │ 500 │ 2000 │ 1500
6 │ 1500 │ 2500 │ 1000
7 │ 500 │ 2000 │ 1500
8 │ 500 │ 2000 │ 1500
9 │ 1500 │ 2500 │ 1000
10 │ 500 │ 2000 │ 1500
11 │ 500 │ 2000 │ 1500
12 │ 1500 │ 2500 │ 1000
(12 rows)
And finally, add the cumulative profit column:
select month "Month", expenses "Expenses", incomes "Income", profit "Profit", sum(profit) over (order by month) "Cumulative Profit" from monthly_profits;
Month │ Expenses │ Income │ Profit │ Cumulative Profit
═══════╪══════════╪════════╪════════╪═════════════════════
1 │ 500 │ 2000 │ 1500 │ 1500
2 │ 500 │ 2000 │ 1500 │ 3000
3 │ 1500 │ 2500 │ 1000 │ 4000
4 │ 500 │ 2000 │ 1500 │ 5500
5 │ 500 │ 2000 │ 1500 │ 7000
6 │ 1500 │ 2500 │ 1000 │ 8000
7 │ 500 │ 2000 │ 1500 │ 9500
8 │ 500 │ 2000 │ 1500 │ 11000
9 │ 1500 │ 2500 │ 1000 │ 12000
10 │ 500 │ 2000 │ 1500 │ 13500
11 │ 500 │ 2000 │ 1500 │ 15000
12 │ 1500 │ 2500 │ 1000 │ 16000
(12 rows)
See how to give complex names to column aliases using quotes "
. Also, we omit as
keyword as it reads well now with these quotes.
That’s the complete financial plan. The Cumulative Profit column allows understand when we can afford to buy something expensive. Say, if we want to buy a car that costs $10,000 we can do that only in month 8. We have cash of more than $10,000 on that month equals $11,000.
In the last month of the year, we will have $16,000 on hand unless spending money on things outside the plan.
Conclusion
The post shows the ease of SQL usage to solve business-related tasks. Views allow dividing tasks into subtasks. That can solve tasks with any complexity.
One could use Ruby/Rails facilities to solve it. But the code would be less performant and most likely could have more lines. Additionally, that code would require tests just to confirm that the code is valid. Pure SQL has no chance of being invalid because it would not “compile”. SQL knowledge is crucial within web software development. It helps to build easier, maintainable solutions without overengineering.
Later on, the solution can be reused in a Rails application. So that we can have a website that helps to generate a personal financial plan. But that’s a story for another article. Keep posted!