Population Dynamics and Sustainability

import matplotlib
if not hasattr(matplotlib.RcParams, "_get"):
    matplotlib.RcParams._get = dict.get

Population Dynamics and Sustainability#

It is absolutely central, in connection with political decision-making for society, to be able to predict how a given population will develop going forward. It is, of course, important to look at a country’s population and be able to plan day-care institutions, schools, elder care, etc., and to assess the economic sustainability of the demographic profile. Exactly the same considerations are meaningful for animal species. For example, it is necessary to be able to describe the dynamics of a fish population before one starts allocating fishing quotas. These are the same issues, just in very different contexts.

In this notebook, we will look at modeling populations using matrices and linear algebra. The basic ideas go back to Patrick Leslie in 1945, who proposed studying populations by dividing them into age classes and drawing on knowledge about the classes’ individual fertility and survival rates. As an example, he looked at the brown rat population. Taken together, this yields a discrete, dynamical system that can be analyzed with eigenvalues and eigenvectors.

Below we will start with small and simple models based on rabbit populations and the Fibonacci numbers. The models are expanded, fertility rates and survival rates are introduced, and we look at how competing species can be modeled. Finally, we arrive at a model for the development of the Danish population with conditions and rates based on data from Statistics Denmark.

Along the way we will highlight essential topics from linear algebra, mathematical modeling with Leslie matrices, eigenvalues and eigenvectors, iteration and convergence, and we will take a few small detours to illuminate computational elements, including looking at control structures and complexity.

Program#

Warm-up: The Fibonacci numbers revisited
Getting real - birth and survival
Competing species
The Danish population

Setup#

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
try:
    from ipywidgets import interact, FloatSlider, Layout
except ImportError:
    import micropip
    await micropip.install("ipywidgets")
    from ipywidgets import interact, FloatSlider, Layout

import matplotlib
if not hasattr(matplotlib.RcParams, "_get"):
    matplotlib.RcParams._get = dict.get

from matplotlib.patches import Circle
from IPython.display import clear_output

1. Warm-up: The Fibonacci numbers revisited#

Fibonacci was an Italian mathematician in the 13th century; he is also known as Leonardo Figlio di Bonacci. Through travels to North Africa he became acquainted with the Hindu-Arabic numeral system (the digits 0-9, decimals), which he brought to Europe in his book Liber Abaci from 1203. The new numbers replaced Roman numerals and triggered a revolution for trade and banking; even back then, mathematics was a source of power and wealth.

Fibonacci had a strong focus on algorithms, equations, and practical problem solving, and it is entirely reasonable to view him as a pioneer of computational thinking - long before the birth of the computer.

In Liber Abaci, Fibonacci carries out a thought experiment about the development of a rabbit population in isolated surroundings. The starting point is a rabbit pair and the mathematical model that every adult pair produces offspring in the form of a new, young pair each month. Young become adults in one month, and there is no mortality.

We choose to count female rabbits rather than pairs. We start with one adult female rabbit, and in the context of population dynamics we consider an age-structured population model where we distinguish between two age classes:

\(\boldsymbol{x}_k[0]\): number of young female rabbits after \(k\) months,
\(\boldsymbol{x}_k[1]\): number of adult female rabbits after \(k\) months.

At \(k=0\) we have \(\boldsymbol{x}_0[0] = 0\) and \(\boldsymbol{x}_0[1] = 1.\) The dynamics are described by

\[\begin{split}\begin{align*} \boldsymbol{x}_{k+1}[0] &= \boldsymbol{x}_{k}[1] &\qquad \text{(each adult rabbit gives rise to a young rabbit next month)},\\ \boldsymbol{x}_{k+1}[1] &= \boldsymbol{x}_{k}[0] + \boldsymbol{x}_{k}[1] & \qquad \text{(each young rabbit becomes an adult next month, and the current adults survive)}. \end{align*}\end{split}\]

If we introduce the matrix

\[\begin{split} A = \begin{bmatrix}0 & 1\\[4pt]1 & 1\end{bmatrix} \end{split}\]

then the population is described by the discrete dynamical system

\[\begin{split} \begin{align*} \boldsymbol{x}_{k+1} = A \boldsymbol{x}_k, \quad k = 0,1,2,\ldots\;\;, \quad \boldsymbol{x}_0 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \end{align*} \end{split}\]

Thus we can follow how the number of young and adults develops month by month.

We can describe the development by iterating the matrix \(A\) as

\[ \boldsymbol{x}_1 = A \boldsymbol{x}_0, \quad \boldsymbol{x}_2 = A \boldsymbol{x}_1 = A^2 \boldsymbol{x}_0, \quad \dots, \quad \boldsymbol{x}_{k+1} = A \boldsymbol{x}_k = A^{k+1} \boldsymbol{x}_0. \]

If we only look at the number of adult female rabbits \(F_k = \boldsymbol{x}_k[1]\), we recover the recursion

\[ F_{k+1} = F_k + F_{k-1}, \quad k = 1,2,3,\ldots\;\;, \]

which, with the start \(F_0 = F_1 = 1\), gives exactly the Fibonacci numbers \(1,1,2,3,5,8,\ldots\)

The matrix \(A\) is therefore often called the Fibonacci matrix, since each iteration corresponds to one month in the model, and the adult population follows the Fibonacci sequence.

Notice how the first and second row in \(A\) describe, respectively, next generation’s young and adults. Interpret likewise the columns in \(A\) in the context of rabbit populations.

Iteration and for-loops in Python#

Loops are a so-called control structure that allows us to run through a sequence of elements and, for each element, perform a given action.

Before we start implementing the Fibonacci matrix and iterating in Python, we do a simple warm-up exercise.

Read the code below and answer the question before you run it in Python.

What do you expect will be printed? Run the code and compare with your prediction.

k = 1
for i in range(5):
    k = k + i
    print("i =", i, ", k =", k)

i = 0 , k = 1
i = 1 , k = 2
i = 2 , k = 4
i = 3 , k = 7
i = 4 , k = 11

What happens if we change range(5) to range(1,6)?

Complete the code cell below so that a for-loop is used to compute and print the first \(10\) Fibonacci numbers. NB: In the code we use the variables F0 and F1 to continuously compute and overwrite the two most recent Fibonacci numbers. It can seem a bit confusing, because one might think that F0 always represents the first Fibonacci number and F1 the next. Here, however, we only use them as placeholders for the newest values in the sequence - not as fixed indices.

F0 = "INSERT CODE HERE"
F1 = "INSERT CODE HERE"

for i in range("INSERT CODE HERE"):
    print("INSERT CODE HERE")
    F0, F1 = "INSERT CODE HERE" , "INSERT CODE HERE"

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[3], line 4
      1 F0 = "INSERT CODE HERE"
      2 F1 = "INSERT CODE HERE"
----> 4 for i in range("INSERT CODE HERE"):
      5     print("INSERT CODE HERE")
      6     F0, F1 = "INSERT CODE HERE" , "INSERT CODE HERE"

TypeError: 'str' object cannot be interpreted as an integer

Matrix iteration#

We have seen that the rabbit population can be described by

\[\begin{split} \boldsymbol{x}_{k+1} = A \boldsymbol{x}_k = A^{k+1}\boldsymbol{x}_0, \quad A = \begin{bmatrix}0 & 1\\ 1 & 1\end{bmatrix}. \end{split}\]

where the initial population is

\[\begin{split} \boldsymbol{x}_0 = \begin{bmatrix}0\\ 1\end{bmatrix} \quad \text{(0 young, 1 adult).} \end{split}\]

Complete the code cell below so that the population for the first 10 months is computed by repeatedly applying \(\boldsymbol{x}_{k+1} = A \boldsymbol{x}_k\). Print the number of young, adults, and total for each month.

A = "INSERT CODE HERE"

initial_population = "INSERT CODE HERE"

num_months = "INSERT CODE HERE"

print("Month | Young | Adults | Total")
population = initial_population.copy()
for month in range(num_months):
    print(f"{month:5d} | {population[0]:4d} | {population[1]:6d} | {population.sum():3d}")
    # Update the population
    population = "INSERT CODE HERE"

Month | Young | Adults | Total

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[4], line 8
      5 num_months = "INSERT CODE HERE"
      7 print("Month | Young | Adults | Total")
----> 8 population = initial_population.copy()
      9 for month in range(num_months):
     10     print(f"{month:5d} | {population[0]:4d} | {population[1]:6d} | {population.sum():3d}")

AttributeError: 'str' object has no attribute 'copy'

In the following code cell we want to show the development of adult rabbits in a plot. We use the list adult_list to continuously store the number in each iteration.

Complete the code cell below so that the number of adult rabbits is plotted for the first 10 months using matrix iteration.

A = "INSERT CODE HERE"

initial_population = "INSERT CODE HERE"

num_months = "INSERT CODE HERE"

# empty list for the number of adult rabbits
adult_list = []

population = initial_population.copy()
for month in range(num_months):
    # add the number of adult rabbits to the list
    adult_list.append(population[1])

    # Update the population
    population = "INSERT CODE HERE"


# Plot number of adult rabbits
plt.figure(figsize=(7,4))
plt.plot(range(num_months), adult_list, 'o-', color='green')
plt.xlabel("Month")
plt.ylabel("Number of adult rabbits")
plt.title("Development of adult rabbits (month 0-9)")
plt.grid(True)
plt.show()

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[5], line 10
      7 # empty list for the number of adult rabbits
      8 adult_list = []
---> 10 population = initial_population.copy()
     11 for month in range(num_months):
     12     # add the number of adult rabbits to the list
     13     adult_list.append(population[1])

AttributeError: 'str' object has no attribute 'copy'

Use that \(\boldsymbol{x}_{k} = A^{k}\boldsymbol{x}_0\) to find the population in month 9 without iterating month by month.
\(A^{k}\) can be made with np.linalg.matrix_power(A,k).

# INSERT CODE HERE

Compare your result with the task where you determined \(\boldsymbol{x}_{k+1}\) by repeated \(\boldsymbol{x}_{k+1}=A\boldsymbol{x}_k\).

Eigenvalues, singular values, and asymptotic development#

An eigenvalue \(\lambda \in \mathbb{R}\) and a corresponding eigenvector \(\boldsymbol{v} \neq 0\) satisfy the eigenvalue problem

\[ A \boldsymbol{v} = \lambda \boldsymbol{v}. \]

This means that when we apply the matrix \(A\) to the vector \(\boldsymbol{v}\), the result is only a scaling of \(\boldsymbol{v}\) by \(\lambda.\)

Geometrically, \(A v\) is parallel to \(\boldsymbol{v}\), but the length changes by the factor \(\lambda\).

An eigenvalue \(\lambda\) can be found using the determinant and satisfies the equation

\[ \det(A - \lambda I) = 0. \]

For the Fibonacci matrix \(A\) this means that

\[\begin{split} \det\begin{pmatrix}-\lambda & 1\\ 1 & 1-\lambda\end{pmatrix} = -\lambda(1-\lambda) - 1 = 0. \end{split}\]

Solve the quadratic equation for \(\lambda\).

See that there is a positive solution, \(\lambda = \lambda_1,\) and a negative solution \(\lambda = \lambda_2,\) and that \(\lambda_1 > |\lambda_2|.\)

lambda1 = "INSERT CODE HERE"
lambda2 = "INSERT CODE HERE"

We now move to a geometric interpretation.

Below you will find the interactive function unit_vector_mapping(). It looks extensive, but you do not need to understand every step.

def unit_vector_mapping(A, slider_width='80%', figsize=(6, 6)):
    """
    Interactive visualization of unit vectors x and their image y = Ax
    for a 2x2 matrix A.
    """

    # Define unit vector as a function of angle t
    def x(t): 
        return np.array([np.cos(t), np.sin(t)])  # Point on the unit circle

    # Define image vector y = A@x(t)
    def y(t): 
        return A @ x(t)

    # Create points for the whole circle (from 0 to 2π)
    ts = np.linspace(0, 2*np.pi, 361)

    # Points in the figure to create suitable axis limits
    circle_pts = np.column_stack([x(tt) for tt in ts]).T   # Points on the unit circle
    image_pts = np.column_stack([y(tt) for tt in ts]).T    # Points after the transformation
    all_pts = np.vstack([circle_pts, image_pts])           
    lim = 1.15 * np.max(np.abs(all_pts))                   

    # Function that draws vectors for a given t
    def plot_vectors(t):
        clear_output(wait=True)  
        fig, ax = plt.subplots(figsize=figsize)

        # plot appearance
        ax.set_aspect('equal')   
        ax.grid(True, linestyle='--', linewidth=0.4, alpha=0.5)
        ax.set_xlim(-lim, lim)
        ax.set_ylim(-lim, lim)

        # Draw the unit circle
        ax.plot(circle_pts[:, 0], circle_pts[:, 1], color='lightgrey', linewidth=1.5)
        ax.add_patch(Circle((0, 0), radius=1, edgecolor='lightgrey', facecolor='none', linewidth=1.5))

        # Pick point on the circle and its image
        v = x(t)     
        Av = y(t)    

        # Helper function to draw arrows for vectors
        def draw_arrow(ax, vec, color, label, z):
            ax.quiver(0, 0, vec[0], vec[1], angles='xy', scale_units='xy', scale=1,
                      color=color, width=0.012, zorder=z)
            # Write coordinates at the tip of the arrow
            ax.text(vec[0], vec[1], f'  {label}{tuple(np.round(vec, 2))}',
                    va='bottom', ha='left', fontsize=9, color=color)

        # Draw the vectors x and Ax
        draw_arrow(ax, v, 'tab:blue', 'x ', 3)
        draw_arrow(ax, Av, 'tab:red', 'Ax ', 4)

        # Title with current angle value
        ax.set_title(rf"$t={t:.2f}$")
        plt.show()

    # Slider to change angle t interactively
    slider = FloatSlider(
        value=0,
        min=0,
        max=2*np.pi,
        step=0.01,
        description='t',
        continuous_update=True,
        readout_format='.2f',
        layout=Layout(width=slider_width)
    )

    # Connect the slider with the visualization function
    interact(plot_vectors, t=slider)

Run the following example.

# example
A = np.array([[0, 1],
              [1, 1]])
unit_vector_mapping(A)

Using unit_vector_mapping(), read off two (linearly) independent vectors \(\boldsymbol{v}=\boldsymbol{v}_1\) and \(\boldsymbol{v}=\boldsymbol{v}_2\) for which \(A \boldsymbol{v}\) is parallel to \(\boldsymbol{v}\).

These are the eigenvectors. Note that if \(\boldsymbol{v}\) is an eigenvector, then \(c\boldsymbol{v}\) is also one for any number \(c\in\mathbb{R}\). In particular, \(-\boldsymbol{v}\) is also an eigenvector with the same eigenvalue.

Similarly, read off the corresponding eigenvalues \(\lambda=\lambda_1\) and \(\lambda=\lambda_2.\)

Define them in the code cell below using np.array().

v1 = "INSERT VALUES HERE"
lamb1 = "INSERT VALUES HERE"

v2 = "INSERT VALUES HERE"
lamb2 = "INSERT VALUES HERE"

Compare the read-off eigenvalues with the exact values \(\lambda_1,\lambda_2\) found above.

Eigenvectors corresponding to an eigenvalue \(\lambda\) can be found by solving the equation

\[ (A-\lambda I )\boldsymbol{v} = 0. \]

In the code below we use our insight from SVD to find a unit vector \(\boldsymbol{w}_1,\) which (approximately) is an eigenvector.

M = A - lambda1 * np.eye(A.shape[0])

# Compute the SVD of M
U, S, Vh = np.linalg.svd(M)

# We remember that the singular values in S are decreasing, so the last one must reflect the null space of M
w1 = Vh.T[:,1]

---------------------------------------------------------------------------
UFuncTypeError                            Traceback (most recent call last)
Cell In[11], line 1
----> 1 M = A - lambda1 * np.eye(A.shape[0])
      3 # Compute the SVD of M
      4 U, S, Vh = np.linalg.svd(M)

UFuncTypeError: ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U16'), dtype('float64')) -> None

Compare the read-off \(\boldsymbol{v}_1\) with the found vector \(\boldsymbol{w}_1\).

print(w1)
print(v1)
print(S)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[12], line 1
----> 1 print(w1)
      2 print(v1)
      3 print(S)

NameError: name 'w1' is not defined

In Python, the eigenvalues and their corresponding eigenvectors can be determined with np.linalg.eig(A).

lamb, V = np.linalg.eig(A)
print("Vector of eigenvalues:\n",lamb)
print("Matrix of eigenvectors as columns:\n",V)

Vector of eigenvalues:
 [-0.61803399  1.61803399]
Matrix of eigenvectors as columns:
 [[-0.85065081 -0.52573111]
 [ 0.52573111 -0.85065081]]

Compare the eigenvalues with the theoretical values you determined earlier.

For our applications we want the eigenvalues in descending order.

# 1. Sort by descending eigenvalue
idx = np.argsort(lamb)[::-1]
lamb = lamb[idx]
V = V[:, idx]

# define eigenvalues and eigenvectors
lamb1 = lamb[0]
lamb2 = lamb[1]
v1 = V[:,0]
v2 = V[:,1]
print("lambda1 = ",lamb1)
print("lambda2 = ",lamb2)
print("v1 = ",v1)
print("v2 = ",v2)

lambda1 =  1.618033988749895
lambda2 =  -0.6180339887498949
v1 =  [-0.52573111 -0.85065081]
v2 =  [-0.85065081  0.52573111]

Asymptotic development#

We can write the initial vector as a linear combination of the eigenvectors \(v_1\) and \(v_2\), so let

\[ \boldsymbol{x}_0 = c_1 \boldsymbol{v}_1 + c_2 \boldsymbol{v}_2. \]

When we apply the matrix \(A\) to \(\boldsymbol{x}_0\), we get, by linearity, that

\[ \boldsymbol{x}_1 = A \boldsymbol{x}_0 = A(c_1 \boldsymbol{v}_1 + c_2 \boldsymbol{v}_2) = c_1 A \boldsymbol{v}_1 + c_2 A \boldsymbol{v}_2 = c_1 \lambda_1 \boldsymbol{v}_1 + c_2 \lambda_2 \boldsymbol{v}_2. \]

Applying \(A\) again yields

\[ \boldsymbol{x}_2 = A \boldsymbol{x}_1 = A(c_1 \lambda_1 \boldsymbol{v}_1 + c_2 \lambda_2 \boldsymbol{v}_2) = c_1 \lambda_1 A \boldsymbol{v}_1 + c_2 \lambda_2 A \boldsymbol{v}_2 = c_1 \lambda_1^2 \boldsymbol{v}_1 + c_2 \lambda_2^2 \boldsymbol{v}_2. \]

Repeating the process, for the \(k\)th iteration we get

\[ \boldsymbol{x}_k = c_1 \lambda_1^k \boldsymbol{v}_1 + c_2 \lambda_2^k \boldsymbol{v}_2. \]

Use the eigenvector decomposition to compute \(\boldsymbol{x}_k\) for \(k=0,\dots,K\) via

\[\boldsymbol{x}_k = c_1 \lambda_1^k \boldsymbol{v}_1 + c_2 \lambda_2^k \boldsymbol{v}_2.\]

So you must determine \(c_1,c_2\) from \(\boldsymbol{x}_0\) and illustrate \(\boldsymbol{x}_k[0]\) and \(\boldsymbol{x}_k[1]\) as a function of \(k\) by completing the code below.

In what follows we want to use the eigenvector decomposition to compute \(\boldsymbol{x}_k\) for \(k=0,\dots,K\) via

\[ \boldsymbol{x}_k = c_1 \lambda_1^k \boldsymbol{v}_1 + c_2 \lambda_2^k \boldsymbol{v}_2. \]

First, this requires that we determine the constants \(c_1\) and \(c_2\) for \(\boldsymbol{x}_0\).

Use that the vector \(\begin{bmatrix}c_1\\c_2\end{bmatrix}\) solves the linear system

\[\begin{split}\begin{bmatrix}\boldsymbol{v}_1&\boldsymbol{v}_2\end{bmatrix}\begin{bmatrix}c_1\\c_2\end{bmatrix}=\boldsymbol{x}_0\end{split}\]

Hint: The solution to a linear system \(A\boldsymbol{x}=\boldsymbol{b}\) can be found with x=np.linalg.solve(A,b).
Hint: Column vectors \(\boldsymbol{x}\) and \(\boldsymbol{y}\) can be assembled into a matrix \(M=\begin{bmatrix}\boldsymbol{x}&\boldsymbol{y}\end{bmatrix}\) using M=np.column_stack([x,y]).

initial_population = np.array([0., 1.])
# Assemble system matrix
V = "INSERT CODE HERE"

# Solve the linear system
c = "INSERT CODE HERE"

c1 = "INSERT CODE HERE"
c2 = "INSERT CODE HERE"

The following cell generates a plot for the number of young and adult rabbits for \(k=0,\ldots,K\).

Complete the cell so that \(\boldsymbol{x}_k\) is computed via the eigenvalue decomposition in each iteration.

A = np.array([[0., 1.],
              [1., 1.]])

K = "INSERT CODE HERE" # choose e.g. 10 (compute k=0..K)

# Compute x_k via formula
punkter = []
for k in range(K+1):
    xk = "INSERT CODE HERE"
    punkter.append(xk)

punkter = np.array(punkter) # shape (K+1, 2)

# Plot components as a function of k
plt.figure(figsize=(8,5))
plt.plot(range(K+1), punkter[:,0], 'o-', label='x[0] (young)')
plt.plot(range(K+1), punkter[:,1], 'o-', label='x[1] (adults)')
plt.xlabel('Month (k)')
plt.ylabel('Number of rabbits')
plt.title('Development of each age class over time')
plt.legend()
plt.grid(True)
plt.show()

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[16], line 8
      6 # Compute x_k via formula
      7 punkter = []
----> 8 for k in range(K+1):
      9     xk = "INSERT CODE HERE"
     10     punkter.append(xk)

TypeError: can only concatenate str (not "int") to str

Check that the values you get for \(\boldsymbol{x}_k\) are the same as the ones you determined earlier. (If you use eigenvectors and eigenvalues that you read off from the plot, you probably will not get exactly the same numbers.)

If we divide both sides of the eigenvector decomposition by \(\lambda_1^k\), we get

\[ \frac{\boldsymbol{x}_k}{\lambda_1^k} = c_1 \boldsymbol{v}_1 + c_2 \left(\frac{\lambda_2}{\lambda_1}\right)^k \boldsymbol{v}_2. \]

Since \(|\lambda_2 / \lambda_1| < 1\), the second term will go to \(0\) as \(k\to\infty\), and thus we get

\[ \frac{\boldsymbol{x}_k}{\lambda_1^k} \to c_1 \boldsymbol{v}_1 \quad \text{for }\; k\to \infty. \]

This means that the vector \(\boldsymbol{x}_k\) grows approximately like \(\lambda_1^k\), and the direction of \(\boldsymbol{x}_k\) approaches \(\boldsymbol{v}_1\).

If we define the demographic profile as

\[\boldsymbol{d}_k = \frac{\boldsymbol{x}_k}{\|\boldsymbol{x}_k\|},\]

then show, with paper and pencil, that when \(\boldsymbol{v}_1\) is a unit vector, we have

\[\boldsymbol{d}_k \to \boldsymbol{v}_1 \quad \text{for } k \to \infty.\]

Show numerically that \(\boldsymbol{d}_k\) approaches \(\boldsymbol{v}_1\) for large \(k\) by completing the code below.
Use np.linalg.norm() to find the length of a vector. Hint: If the graph does not go to 0, try comparing with \(-\boldsymbol{v}_1\).

K = 'INSERT CODE HERE'

# Compute d_k = x_k / ||x_k|| for k = 0..K using the eigenvector formula
dk_list = []
for k in range(K+1):
    xk = 'INSERT CODE HERE'
    dk = 'INSERT CODE HERE'
    dk_list.append(dk)
dk_arr = np.array(dk_list)

# Compute distance to reference and plot (log-scale)
ks = np.arange(K+1)
afstand = np.linalg.norm(dk_arr - v1, axis=1)

plt.figure(figsize=(6,4))
plt.semilogy(ks, afstand, 'o-')
plt.xlabel('k')
plt.ylabel(r'$\|d_k - \hat v_1\|$')
plt.title('Distance between $d_k$ and $\hat v_1$')
plt.grid(True)
plt.show()

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[17], line 5
      3 # Compute d_k = x_k / ||x_k|| for k = 0..K using the eigenvector formula
      4 dk_list = []
----> 5 for k in range(K+1):
      6     xk = 'INSERT CODE HERE'
      7     dk = 'INSERT CODE HERE'

TypeError: can only concatenate str (not "int") to str

2. Getting real - birth and survival#

Up to now we have worked with the Fibonacci matrix, where all elements were \(1\) and the population followed the Fibonacci sequence. The elements in the matrix can be interpreted as survival rates and birth rates. We now generalize the model by introducing:

\(f_{2}\): birth rate - expected number of young per adult per month (probability of offspring),
\(s_{1}\): survival rate from young to adult (probability that a young survives to become an adult next month),
\(s_{2}\): survival rate for adults (probability that an adult survives to next month).

The new transition matrix becomes:

\[\begin{split} A = \begin{bmatrix} 0 & f_{2} \\ s_{1} & s_{2} \end{bmatrix}. \end{split}\]

Show, by looking at the characteristic equation \(\det(A-\lambda I) = 0\) with paper and pencil, that if \(f_2, s_1, s_2\) are all positive, then the matrix has two real eigenvalues, \(\lambda_1>0\) and \(\lambda_2<0\), and that \(\lambda_1 > |\lambda_2|.\)

In the following task we proceed in the same way as in the first task, where we now investigate a different transition matrix. You can reuse most of the code we used in the first task.

The task therefore is the following:

Choose some values for \(f_{2} \in (0.2, 0,7), s_{1}, s_{2} \in (0.7,1)\) and build \(A\). Consider what good biological values could be (you may use a chatbot to help you). Remember to define \(A\) in a code cell.

Choose an initial population. Remember to define it in a code cell.

Use unit_vector_mapping(A) or np.linalg.eig(A) to find the eigenvectors \(\boldsymbol{v}_1\) and \(\boldsymbol{v}_2\) and eigenvalues \(\lambda_1\) and \(\lambda_2\) as before.

Use the eigenvector decomposition to determine \(x_k\) for \(k=0,\dots,K\) (use the formula \(\boldsymbol{x}_k = c_1\lambda_1^k \boldsymbol{v}_1 + c_2\lambda_2^k \boldsymbol{v}_2\)) and plot the graphs for both \(\boldsymbol{x}_k[0]\) and \(\boldsymbol{x}_k[1]\).

Show numerically that \(\boldsymbol{d}_k\) still goes to \(\boldsymbol{v}_1\) for large \(k\).

'INSERT CODE HERE'

'INSERT CODE HERE'

When you have the above working, you can try different values - both for the transition matrix and for the initial population.

More age classes#

It turns out that a rabbit population can advantageously be divided into three age classes according to the same principle.
The division is:

\(\boldsymbol{x}_k[0]\): number of rabbit kits in month \(k\)
\(\boldsymbol{x}_k[1]\): number of young adults in month \(k\)
\(\boldsymbol{x}_k[2]\): number of adults in month \(k\)

This is because young adults have a relatively low birth rate (about half of adults) and the survival rate in nature from young adult to adult is relatively low. Nature is harsh; there is a large loss.

The transitions are modeled via the transition matrix

\[\begin{split} A = \begin{bmatrix} 0 & f_2 & f_3\\[4pt] s_1 & 0 & 0\\[4pt] 0 & s_2 & s_3 \end{bmatrix}, \end{split}\]

where

\(f_2\) and \(f_3\) are birth rates: expected number of young per individual in age classes 2 and 3 (monthly), respectively.
\(s_1\) is the survival rate from young to young-adults (class \(0 \rightarrow 1\)).
\(s_2\) is the survival rate from young-adults to adults (class \(1 \rightarrow 2\)).
\(s_3\) is the survival rate for adults to adults (class \(2 \rightarrow 2\)).

Choose values for \(f_2,f_3,s_1,s_2,s_3\) and for the initial population and define them in the code cell below. Again, find biologically reasonable values (e.g. using generative AI).

f2 = 'INSERT CODE HERE'
f3 = 'INSERT CODE HERE'
s1 = 'INSERT CODE HERE'
s2 = 'INSERT CODE HERE'
s3 = 'INSERT CODE HERE'
initial_population = 'INSERT CODE HERE'

It turns out that such a matrix, as we have now defined, always has a positive eigenvalue that is dominant. And the associated eigenvector can be chosen such that the entries are non-negative. This is a variant of the Perron-Frobenius theorem.

Define the matrix \(A\) and use lamb, V = np.linalg.eig(A) to find the largest and dominant eigenvalue (lamb) and the associated eigenvectors (V).

A = 'INSERT CODE HERE'

lamb, V = 'INSERT CODE HERE'

# Sort so the dominant eigenvalue comes first
ind = np.argsort(np.abs(lamb))[::-1]
lamb = lamb[ind]
V = V[:, ind]

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[20], line 3
      1 A = 'INSERT CODE HERE'
----> 3 lamb, V = 'INSERT CODE HERE'
      5 # Sort so the dominant eigenvalue comes first
      6 ind = np.argsort(np.abs(lamb))[::-1]

ValueError: too many values to unpack (expected 2)

As before, we can use the eigenvalue decomposition on \(\boldsymbol{x}_0\), so

\[ \boldsymbol{x}_0 = c_1 \boldsymbol{v}_1 + c_2 \boldsymbol{v}_2 + c_3 \boldsymbol{v}_3. \]

Define \(\boldsymbol{v}_1\), \(\boldsymbol{v}_2\) and \(\boldsymbol{v}_3\) in the code cell below.

v1 = 'INSERT CODE HERE'
v2 = 'INSERT CODE HERE'
v3 = 'INSERT CODE HERE'

Determine the coefficients \(c_1,c_2,c_3\) in the eigenvalue decomposition of \(\boldsymbol{x}_0\) with np.linalg.solve(V, initial_population).

# INSERT CODE HERE

Use the eigenvector decomposition to determine \(\boldsymbol{x}_k\) for \(k=0,\dots,K\).

# INSERT CODE HERE

Investigate the demographic profile \(\boldsymbol{d}_k\).

# INSERT CODE HERE

Experiment with the choice of the rates and try to answer the following:
How should we regulate the rates so that we have a stable population?
Hint: Consider the cases \(\lambda_1 <1,\) \(\lambda_1 =1\) or \(\lambda_1 >1\).

3. Competing species#

We now introduce a new competitor - another rabbit species - released into the same area. The species will mix, interbreed, and have offspring. We want to investigate how the populations develop as time goes on. We still look at females, with two classes (young and adults) and describe the original species as before by \(\boldsymbol{x}_k = \begin{bmatrix}\boldsymbol{x}_k[0]\\ \boldsymbol{x}_k[1]\end{bmatrix}\) and the new species by \(\boldsymbol{y}_k = \begin{bmatrix}\boldsymbol{y}_k[0]\\ \boldsymbol{y}_k[1]\end{bmatrix}\).

The two species mix and produce offspring across species. We model according to the principle that offspring for which at least one parent is of the new species belong to the new species.

This gives an extended dynamics:

\[\begin{split} \boldsymbol{z}_{k+1} = \begin{bmatrix} \boldsymbol{x}_{k+1}[0] \\ \boldsymbol{x}_{k+1}[1] \\ \boldsymbol{y}_{k+1}[0] \\ \boldsymbol{y}_{k+1}[1] \end{bmatrix} = \begin{bmatrix} 0 & a_{12} & 0 & 0 \\ a_{21} & a_{22} & 0 & 0 \\ 0 & a_{32} & 0 & a_{34} \\ 0 & 0 & a_{43} & a_{44} \end{bmatrix} \begin{bmatrix} \boldsymbol{x}_k[0] \\ \boldsymbol{x}_k[1] \\ \boldsymbol{y}_k[0] \\ \boldsymbol{y}_k[1] \end{bmatrix} = A \boldsymbol{z}_k \end{split}\]

Selection and if-elif-else statements in Python#

Before we look at the population development for the two species, we will focus on a control structure called selection.

It is a control structure that is absolutely central when programming. Basically, it is about placing a condition in the program so that if the condition is satisfied, we perform one action, and if the condition is not satisfied, we perform another action (or no action).

The basic form is:

if first_condition:
    # run this if the first condition is true
elif second_condition:
    # otherwise if the second condition is true
else:
    # otherwise run this

You can omit both elif and else as needed.

Complete the code cell below which, based on a number x, does the following:

If x is greater than 0, print "The number is positive".

If x is less than 0, print "The number is negative".

Otherwise (if x equals 0), print "The number is zero"

x  = np.random.standard_normal() # We choose a random number from a standard normal distribution

if x>0:
    print("The number is positive , x = ",x)
elif "INSERT CODE HERE":
    print("The number is negative, x = ",x)
else:
    print("The number is zero")

The number is positive , x =  0.6426333367642123

Test that the code cell works for positive x, negative x, and if x is 0.

A biological principle is called the “Competitive Exclusion Principle”. It states that two species that compete and reproduce based on the same resources cannot coexist. One of the species will take over and the other will die out. This is one of the reasons species develop niche separation through evolution.

We will see this principle unfold for our two rabbit populations.

We are now ready to investigate the development of the two competing species’ populations. In the code cell below values have been chosen for \(a_{12},a_{21},a_{22},a_{32},a_{34},a_{43},a_{44}\) and for an initial population for \(z_0\). The two populations develop as before, but the new thing is that part of the offspring from the original species belong to the new species.

# Rates
a12 = 0.4    # x-adults -> x-young
a21 = 0.55   # x-young -> x-adults
a22 = 0.6    # x-adults -> x-adults

a32 = 0.2    # x-adults -> y-young
a34 = 0.4    # y-adults -> y-young
a43 = 0.6    # y-young -> y-adults
a44 = 0.6   # y-adults -> y-adults

# Initial population
initial_population = np.array([2000, 3000, 1, 2])

In the previous tasks we used eigenvalue decomposition for iteration. In the rest of the tasks we will again look at the simple matrix iteration \(\boldsymbol{z}_{k+1}=A\boldsymbol{z}_k\)

Implement the transition matrix \(A\) in the code cell below.

A = 'INSERT CODE HERE'

Complete the code cell below so that the following is done in each iteration for \(k=0, \dots, K\):

compute \(\boldsymbol{z}_{k+1}\).

compute total population for \(\boldsymbol{x}\), \(\boldsymbol{y}\) and the combined total population (both \(\boldsymbol{x}\) and \(\boldsymbol{y}\)).

use an if statement to print the years where species \(\boldsymbol{y}\) makes up more than half of the population.

Note: In the code we define a matrix \(Z\) so that the \(k\)th row is \((Z)_k=\boldsymbol{z}_k\). In Python we then get \(\boldsymbol{z}_k\) as Z[k] and the elements \(\boldsymbol{z}_k[i]\) as Z[k,i] for \(i\in\{0,1,2,3\}\).

# Iterate over K years
K = 'INSERT CODE HERE'
Z = np.zeros((K+1, 4))
Z[0] = initial_population

for k in range(K):
    Z[k+1] = 'INSERT CODE HERE'

    # Compute totals
    total_x = Z['INSERT CODE HERE', 'INSERT CODE HERE'] + Z['INSERT CODE HERE', 'INSERT CODE HERE'] # x_{k+1}[0] + x_{k+1}[1]
    total_y = Z['INSERT CODE HERE', 'INSERT CODE HERE'] + Z['INSERT CODE HERE', 'INSERT CODE HERE'] # y_{k+1}[0] + y_{k+1}[1]
    total = 'INSERT CODE HERE' # x_{k+1}[0] + x_{k+1}[1] + y_{k+1}[0] + y_{k+1}[1]

    # Check whether species y makes up more than half
    if 'INSERT CODE HERE':
        print(f"Month {k+1}: Species Y makes up >50% (share = {total_y/total:.2f})")
        #break
    else:
        print(f"Month {k+1}: Species Y <50% (share = {total_y/total:.2f})")

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[28], line 3
      1 # Iterate over K years
      2 K = 'INSERT CODE HERE'
----> 3 Z = np.zeros((K+1, 4))
      4 Z[0] = initial_population
      6 for k in range(K):

TypeError: can only concatenate str (not "int") to str

In the if statement there is the following line:

# break

Try removing # and run the code cell again.

Explain, based on this, what a break statement does and use this to find the smallest year where \(y\) makes up more than half of the total population.

You can use the function below to see the development of the two species’ populations. If you removed # in the code cell above from the break statement, you must add it back.

def plot_totalpopulation(Z):
    year = np.arange(Z.shape[0])
    total_x = Z[:,0] + Z[:,1]
    total_y = Z[:,2] + Z[:,3]

    fig, ax = plt.subplots(1, 2, figsize=(12,5), sharey=True)

    # Species X
    ax[0].semilogy(year, total_x, 'o-', color='tab:blue')
    ax[0].set_xlabel('Months')
    ax[0].set_ylabel('Number of individuals')
    ax[0].set_title('Species X (total population)')
    ax[0].grid(True)

    # Species Y
    ax[1].plot(year, total_y, 'o-', color='tab:orange')
    ax[1].set_xlabel('Months')
    ax[1].set_title('Species Y (total population)')
    ax[1].grid(True)

    plt.tight_layout()
    plt.show()


plot_totalpopulation(Z)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[29], line 25
     21     plt.tight_layout()
     22     plt.show()
---> 25 plot_totalpopulation(Z)

NameError: name 'Z' is not defined

About how many months pass before species \(\boldsymbol{x}\) makes up less than 10% of the total population?

In the following code cell we define a function dominating_eigenvalue that returns the dominant eigenvalue and the corresponding eigenvector for a given matrix \(A\).

Try to predict the demographic profile using the dominant eigenvalue and eigenvector.
Will the original species always die out?

def dominating_eigenvalue(A):
    """
    Computes the dominant eigenvalue and corresponding eigenvector for matrix A.
    """
    eigenvalues, eigenvectors = np.linalg.eig(A)

    # Find index of the largest eigenvalue (absolute value)
    idx = np.argmax(np.abs(eigenvalues))

    # Largest eigenvalue pair
    largest_eigenvalue = eigenvalues[idx]
    largest_eigenvector = eigenvectors[:, idx]

    # Normalize to norm 1 and positive sum
    largest_eigenvector = largest_eigenvector / np.sum(largest_eigenvector)
    largest_eigenvector = largest_eigenvector / np.linalg.norm(largest_eigenvector)

    # Take real part
    largest_eigenvalue = np.real(largest_eigenvalue)
    largest_eigenvector = np.real(largest_eigenvector)

    return largest_eigenvalue, largest_eigenvector

# Print results
largest_eigenvalue, largest_eigenvector = dominating_eigenvalue(A)
print("Largest eigenvalue:", largest_eigenvalue)
print("Corresponding eigenvector:", largest_eigenvector)

---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
Cell In[30], line 25
     22     return largest_eigenvalue, largest_eigenvector
     24 # Print results
---> 25 largest_eigenvalue, largest_eigenvector = dominating_eigenvalue(A)
     26 print("Largest eigenvalue:", largest_eigenvalue)
     27 print("Corresponding eigenvector:", largest_eigenvector)

Cell In[30], line 5, in dominating_eigenvalue(A)
      1 def dominating_eigenvalue(A):
      2     """
      3     Computes the dominant eigenvalue and corresponding eigenvector for matrix A.
      4     """
----> 5     eigenvalues, eigenvectors = np.linalg.eig(A)
      7     # Find index of the largest eigenvalue (absolute value)
      8     idx = np.argmax(np.abs(eigenvalues))

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/numpy/linalg/linalg.py:1327, in eig(a)
   1195 """
   1196 Compute the eigenvalues and right eigenvectors of a square array.
   1197 
   (...)   1324 
   1325 """
   1326 a, wrap = _makearray(a)
-> 1327 _assert_stacked_2d(a)
   1328 _assert_stacked_square(a)
   1329 _assert_finite(a)

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/numpy/linalg/linalg.py:206, in _assert_stacked_2d(*arrays)
    204 for a in arrays:
    205     if a.ndim < 2:
--> 206         raise LinAlgError('%d-dimensional array given. Array must be '
    207                 'at least two-dimensional' % a.ndim)

LinAlgError: 0-dimensional array given. Array must be at least two-dimensional

4. The Danish population#

We now extend our age-structured models to a larger population. We choose to consider the entire Danish population. The population is described by a vector \(\boldsymbol{x} \in \mathbb{R}^{100}\), where each element \(\boldsymbol{x}[i]\) corresponds to the number of people of age \(i\). Thus we only consider people aged 0-99 years, but let \(\boldsymbol{x}[99]\) describe ages \(\geq 99\) years. The dynamics

\[ \boldsymbol{x}_{k+1} = A \boldsymbol{x}_k + \boldsymbol{b} \]

are controlled by a Leslie matrix \(A \in \mathbb{R}^{100 \times 100}\).

The transition matrix \(A\) (which, as mentioned, is a Leslie matrix) has the following form:

The first row of \(A\) contains birth rates \(f_{15}, \dots, f_{51} > 0\), i.e., age classes where people typically have children.
The subdiagonal contains survival rates \(s_i > 0\), which indicate the probability that people in age class \(i\) survive to age class \(i+1\).
All other elements of \(A\) are zero.

That is, \(A\) has the form

\[\begin{split} A = \begin{bmatrix} f_1 & f_2 & f_3 & \cdots & f_n \\ s_1 & 0 & 0 & \cdots & 0 \\ 0 & s_2 & 0 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & s_{n-1} & 0 \end{bmatrix}. \end{split}\]

\(\boldsymbol{b}\) can be used to add migration or other external inflows.

Population data#

In the csv file ‘population_data.csv’ we have collected population data for the Danish population in 2024. All numbers are from Statistics Denmark. In the following code cell the file is imported, and the top rows of the dataset are printed.

Run the cell and see what the dataset contains.

df = pd.read_csv('Data/population_data.csv')
print("Columns in the data file:\n",df.head())

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[31], line 1
----> 1 df = pd.read_csv('Data/population_data.csv')
print("Columns in the data file:\n",df.head())

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py:873, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, skip_blank_lines, parse_dates, date_format, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, on_bad_lines, low_memory, memory_map, float_precision, storage_options, dtype_backend)
kwds_defaults = _refine_defaults_read(
   dialect,
   delimiter,
   (...)    869     dtype_backend=dtype_backend,
)
kwds.update(kwds_defaults)
--> 873 return _read(filepath_or_buffer, kwds)

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py:300, in _read(filepath_or_buffer, kwds)
_validate_names(kwds.get("names", None))
# Create the parser.
--> 300 parser = TextFileReader(filepath_or_buffer, **kwds)
if chunksize or iterator:
   return parser

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1645, in TextFileReader.__init__(self, f, engine, **kwds)
   self.options["has_index_names"] = kwds["has_index_names"]
self.handles: IOHandles | None = None
-> 1645 self._engine = self._make_engine(f, self.engine)

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1904, in TextFileReader._make_engine(self, f, engine)
   if "b" not in mode:
       mode += "b"
-> 1904 self.handles = get_handle(
   f,
   mode,
   encoding=self.options.get("encoding", None),
   compression=self.options.get("compression", None),
   memory_map=self.options.get("memory_map", False),
   is_text=is_text,
   errors=self.options.get("encoding_errors", "strict"),
   storage_options=self.options.get("storage_options", None),
)
assert self.handles is not None
f = self.handles.handle

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/pandas/io/common.py:926, in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
elif isinstance(handle, str):
   # Check whether the filename is to be opened in binary mode.
   # Binary mode does not support 'encoding' and 'newline'.
   if ioargs.encoding and "b" not in ioargs.mode:
       # Encoding
--> 926         handle = open(
           handle,
           ioargs.mode,
           encoding=ioargs.encoding,
           errors=errors,
           newline="",
       )
   else:
       # Binary mode
       handle = open(handle, ioargs.mode)

FileNotFoundError: [Errno 2] No such file or directory: 'Data/population_data.csv'

Initially we are interested in the column Population, which contains data for the age distribution of the Danish population. This is the data we will use as our initial population \(\boldsymbol{x}_0\).

We can define a column as an \(\mathbb{R}^{100}\) vector in a NumPy array.

# Convert to NumPy arrays
age = df["Age"].to_numpy()
population = df['Population'].to_numpy()

print("Size of the dataset:\n", len(age))

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[32], line 2
      1 # Convert to NumPy arrays
----> 2 age = df["Age"].to_numpy()
      3 population = df['Population'].to_numpy()
      5 print("Size of the dataset:\n", len(age))

NameError: name 'df' is not defined

The following cell generates a plot of the age distribution in Denmark.

# Plot initial population
plt.figure(figsize=(8,4))
plt.bar(age, population)
plt.xlabel("Age")
plt.ylabel("Number of people")
plt.title("Age distribution in the initial population (0-99 years)")
plt.grid(axis="y", linestyle="--", alpha=0.7)
plt.show()

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[33], line 3
      1 # Plot initial population
      2 plt.figure(figsize=(8,4))
----> 3 plt.bar(age, population)
      4 plt.xlabel("Age")
      5 plt.ylabel("Number of people")

NameError: name 'age' is not defined

<Figure size 800x400 with 0 Axes>

Choosing your own dataset#

This section is an optional exercise if you want to learn how to import a dataset into Python. To complete it, you’ll need to run the code locally.

What would the population model look like if you start from the age distribution of a Danish municipality of your choice as the initial population \(\boldsymbol{x}_0\)? The task is as follows:

Open the following link to Statistics Denmark.

Find a home municipality under the category OMRÅDE (“REGION”).

Select Markér alle (“Select all”) under the category ALDER (“AGE”).

Click VIS TABEL (“SHOW TABLE”).

Switch from Excel (*.xlsx) to Matrix (*.csv) and click the blue arrow right next to it to download the CSV file.

Place the CSV file in the same directory as this Jupyter notebook.

Change INSERT CODE HERE to the filename of the CSV file in the code cell below.

# Load CSV
df = pd.read_csv("INSERT CODE HERE", sep=";", quotechar='"', encoding="latin1", header=None)

# Get initial population for ages 0-99
# Column 0: age, column 1: number of people
# Skip the first row "Alder i alt" ("Age total")
initial_population = df.iloc[1:101, 1].astype(int).to_numpy()  # 0-99 years

# Plot initial population
plt.figure(figsize=(10,5))
plt.bar(age, initial_population)
plt.xlabel("Age")
plt.ylabel("Number of people")
plt.title("Age distribution in the initial population (0-99 years)")
plt.grid(axis="y", linestyle="--", alpha=0.7)
plt.show()

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[34], line 2
# Load CSV
----> 2 df = pd.read_csv("INSERT CODE HERE", sep=";", quotechar='"', encoding="latin1", header=None)
# Get initial population for ages 0-99
# Column 0: age, column 1: number of people
# Skip the first row "Alder i alt" ("Age total")
initial_population = df.iloc[1:101, 1].astype(int).to_numpy()  # 0-99 years

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py:873, in read_csv(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, skip_blank_lines, parse_dates, date_format, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, encoding_errors, dialect, on_bad_lines, low_memory, memory_map, float_precision, storage_options, dtype_backend)
kwds_defaults = _refine_defaults_read(
   dialect,
   delimiter,
   (...)    869     dtype_backend=dtype_backend,
)
kwds.update(kwds_defaults)
--> 873 return _read(filepath_or_buffer, kwds)

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py:300, in _read(filepath_or_buffer, kwds)
_validate_names(kwds.get("names", None))
# Create the parser.
--> 300 parser = TextFileReader(filepath_or_buffer, **kwds)
if chunksize or iterator:
   return parser

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1645, in TextFileReader.__init__(self, f, engine, **kwds)
   self.options["has_index_names"] = kwds["has_index_names"]
self.handles: IOHandles | None = None
-> 1645 self._engine = self._make_engine(f, self.engine)

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/pandas/io/parsers/readers.py:1904, in TextFileReader._make_engine(self, f, engine)
   if "b" not in mode:
       mode += "b"
-> 1904 self.handles = get_handle(
   f,
   mode,
   encoding=self.options.get("encoding", None),
   compression=self.options.get("compression", None),
   memory_map=self.options.get("memory_map", False),
   is_text=is_text,
   errors=self.options.get("encoding_errors", "strict"),
   storage_options=self.options.get("storage_options", None),
)
assert self.handles is not None
f = self.handles.handle

File /builds/ctm/ctmweb/venv/lib/python3.11/site-packages/pandas/io/common.py:926, in get_handle(path_or_buf, mode, encoding, compression, memory_map, is_text, errors, storage_options)
elif isinstance(handle, str):
   # Check whether the filename is to be opened in binary mode.
   # Binary mode does not support 'encoding' and 'newline'.
   if ioargs.encoding and "b" not in ioargs.mode:
       # Encoding
--> 926         handle = open(
           handle,
           ioargs.mode,
           encoding=ioargs.encoding,
           errors=errors,
           newline="",
       )
   else:
       # Binary mode
       handle = open(handle, ioargs.mode)

FileNotFoundError: [Errno 2] No such file or directory: 'INSERT CODE HERE'

The Leslie matrix: A simple model#

We want to investigate how the population develops over time using the Leslie matrix. We start with a simplified model where birth rates and survival rates are assumed to be uniform.

Define the Leslie matrix. Choose values yourself for:

Uniform birth rate for age classes 15-51 years.

Uniform survival rate for all age classes.

# Define Leslie matrix
k = 100
A = np.zeros((k, k))

# Birth rates
for i in range(15, 52):
    A[0, i] = 0.05 #'INSERT CODE HERE'

# Survival rates
for i in range(k-1):
    A[i+1, i] = 0.95 #'INSERT CODE HERE'

In the following code cell a function population_over_time is defined, which simulates a population’s development over time given a transition matrix and an initial population.

The function also generates two figures:

total population over time,
the age distribution after \(K\) years.

Complete the function.

def population_over_time(A, initial_population, K, b = None, plot=True):
    """
    Computes population over K years given transition matrix A, initial population, and migration vector b.
    Returns a matrix with the population for each age class over time.
    """
    k = A.shape[0]
    X = np.zeros((K+1, k))
    X[0] = initial_population

    # set b to the zero vector if not given
    if b is None:
        b = np.zeros(k)

    # iteration step
    for year in range(K):
        X[year+1] = 'INSERT CODE HERE'

    # generate plots if desired
    if plot:
        # plot total population over time
        total_population = X.sum(axis=1)
        plt.figure(figsize=(8,4))
        plt.plot(range(K+1), total_population, 'o-')
        plt.xlabel('Year')
        plt.ylabel('Total population')
        plt.title('Development of the Danish population')
        plt.grid(True)
        plt.show()

        # plot age distribution after K years
        plt.figure(figsize=(8,4))
        plt.bar(range(k), X[K])
        plt.xlabel('Age')
        plt.ylabel('Number of people')
        plt.title(f'Age distribution after {K} years')
        plt.show()

    return X

Run the function for your matrix \(A\) and look at the plots. Choose the number of years \(K\) yourself.

_ = population_over_time(A, population, K=100)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[37], line 1
----> 1 _ = population_over_time(A, population, K=100)

NameError: name 'population' is not defined

What is the long-run development of the total population? Increasing/decreasing/stable?

To discuss the long-run development of the population we introduce the concept of the net reproduction rate.

Net reproduction rate#

The net reproduction rate (NRR) is the average number of offspring that a newborn individual is expected to produce over its lifetime. In practice one usually models only females, since they determine the effective reproduction in the population. Recall the form of a Leslie matrix

\[\begin{split} A = \begin{bmatrix} f_1 & f_2 & f_3 & \cdots & f_n \\ s_1 & 0 & 0 & \cdots & 0 \\ 0 & s_2 & 0 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & s_{n-1} & 0 \end{bmatrix}. \end{split}\]

For an \(n\times n\) Leslie matrix, the net reproduction rate \(R_0\) is computed as

\[ R_0 = \sum_{i=1}^{n} f_i \, l_i , \]

where \(f_i\) is the fertility rate for age class \(i\), and \(l_i\) is the probability of surviving to the beginning of age class \(i\). The survival probability \(l_i\) is defined recursively as

\[ l_1 = 1, \qquad l_2 = s_1, \qquad l_3 = s_1 s_2, \qquad \dots, \qquad l_i = \prod_{j=1}^{i-1} s_j , \]

where \(s_j\) is the probability of surviving from age class \(j\) to \(j+1\). Thus \(R_0\) can also be written explicitly as

\[ R_0 = f_1 + f_2 s_1 + f_3 s_1 s_2 + \dots + f_n s_1 s_2 \cdots s_{n-1} =\sum_{i=1}^n f_i\left( \prod_{j=1}^{i-1}s_j \right). \]

In the following code cell we define a function that computes the net reproduction rate for a given Leslie matrix.

Complete the function.

def NRR(A):
    """
    Computes the net reproduction rate R0 for Leslie matrix A.
    """
    n = A.shape[0]
    R0 = 0.0

    # Compute R0
    for i in range(n):
        # birth rate for age class i
        fi = "INSERT CODE HERE"

        # survival from birth to age class i
        li = 1.0
        for j in range(i):
            # survival from age class j to j+1
            sj = "INSERT CODE HERE"
            li *= sj
        R0 += fi * li

    return R0

The interpretation of \(R_0\) is that

if \(R_0>1\) the population is expected to grow,
if \(R_0=1\) the population is stable,
if \(R_0<1\) the population is expected to decline.

Use the function to examine the net reproduction rate. Does the interpretation match what you could see in the plots?

"INSERT CODE HERE"

'INSERT CODE HERE'

We have the same interpretations for the dominant eigenvalue and whether it is above/below 1.

Examine the dominant eigenvalue using dominating_eigenvalue as defined earlier.

Do the interpretations match each other? Is \(R_0=\lambda_{max}\)?

"INSERT CODE HERE"

'INSERT CODE HERE'

Try going back to where the Leslie matrix was defined. Can you find values for birth or survival rates where the net reproduction rate is close to 1? Show the plots again and look at the long-run development.

Extension of the Leslie model#

To better reflect reality we will now experiment with a non-uniform birth and survival rate that varies with age.

The first extension is that instead of choosing a constant rate for all fertile age classes, we let the rate follow a parabola that peaks at age 33 and is zero at ages 14 and 52.

Use this parabola to model the birth rate and update the Leslie matrix in the code below. Let f_max be the value of the birth rate at the vertex. Continue to use a uniform survival rate.

# Define Leslie matrix
k = 100
A_new = np.zeros((k, k))

# vertex of birth rates
f_max = "INSERT CODE HERE"

# Birth rates
for i in range(15, 52):
    A_new[0, i] = 'INSERT CODE HERE'

# Survival rates
for i in range(k-1):
    A_new[i+1, i] = 'INSERT CODE HERE'

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[41], line 10
      8 # Birth rates
      9 for i in range(15, 52):
---> 10     A_new[0, i] = 'INSERT CODE HERE'
     12 # Survival rates
     13 for i in range(k-1):

ValueError: could not convert string to float: 'INSERT CODE HERE'

Investigate the population development using population_over_time and the net reproduction rate.

R0 = NRR(A_new)
print("Net reproduction rate R0 for extended model:", R0)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[42], line 1
----> 1 R0 = NRR(A_new)
      2 print("Net reproduction rate R0 for extended model:", R0)

Cell In[38], line 19, in NRR(A)
     17         sj = "INSERT CODE HERE"
     18         li *= sj
---> 19     R0 += fi * li
     21 return R0

TypeError: can't multiply sequence by non-int of type 'float'

_ = population_over_time(A_new, population, K=100)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[43], line 1
----> 1 _ = population_over_time(A_new, population, K=100)

NameError: name 'population' is not defined

Does the new birth rate make a difference compared to the figures from the simple model?

We will now extend our model further: Let the survival rate be a linear function with the highest rate at age 0, decreasing toward older age classes.

Introduce the new survival rate above where A_new is defined. Choose and experiment with the survival rate at age 0 and age 99.

Try to find values for f_max and the survival rate for ages 0 and 99 such that the Leslie matrix has net reproduction rate \(R_0\approx 1\).
Compare through plots with the simple model with \(R_0\approx 1\).

Data-driven population modeling#

We now want to use our dataset to create a Leslie matrix that reflects the Danish population as well as possible. In the following cell we create NumPy arrays for the remaining columns in the dataset.

deaths = df['Deaths'].to_numpy()
fertility = df['Fertility'].to_numpy()
immigration = df['Immigration'].to_numpy()
emigration = df['Emigration'].to_numpy()

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[44], line 1
----> 1 deaths = df['Deaths'].to_numpy()
      2 fertility = df['Fertility'].to_numpy()
      3 immigration = df['Immigration'].to_numpy()

NameError: name 'df' is not defined

The columns describe the following for each age class:

Deaths: number of deaths.
Fertility: births per 1000 women.
Immigration: number of people immigrating.
Emigration: number of people emigrating.

In the Leslie matrix we want the birth rate per adult. The survival rate for the Leslie matrix can be found from the deaths and the population numbers.

Complete the code in the following cell so that we obtain birth and survival rates in the desired form.
Hint: Assume that half the population is women.

# fertility is per 1000 women, must be converted to per person
birth_rate = "INSERT CODE HERE"

# survival rate computed from population and deaths
survival_rate = "INSERT CODE HERE"

In the following cells the birth rate, survival rate, and population counts are shown.

Run the cell.

# Birth rate plot
plt.figure(figsize=(6,3))
plt.plot(age, birth_rate, 'o-')
plt.title('Birth rate')
plt.ylabel('Rate')
plt.xlabel('Age')
plt.show()

# Survival rate plot
plt.figure(figsize=(6,3))
plt.plot(age, survival_rate, 'o-')
plt.title('Survival rate')
plt.xlabel('Age')
plt.ylabel('Rate')
plt.show()

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[46], line 3
      1 # Birth rate plot
      2 plt.figure(figsize=(6,3))
----> 3 plt.plot(age, birth_rate, 'o-')
      4 plt.title('Birth rate')
      5 plt.ylabel('Rate')

NameError: name 'age' is not defined

<Figure size 600x300 with 0 Axes>

Does it look as one would expect?

How well does it match the rates we used in the previous Leslie matrices?

We can now assemble the Leslie matrix \(A\).

k = 100
A = np.zeros((k, k))
for i in range(k):
    A[0, i] = birth_rate[i]
for i in range(k-1):
    A[i+1, i] = survival_rate[i]
A[-1,-1] = survival_rate[-1]  # The last age class is self-absorbing

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[47], line 4
      2 A = np.zeros((k, k))
      3 for i in range(k):
----> 4     A[0, i] = birth_rate[i]
      5 for i in range(k-1):
      6     A[i+1, i] = survival_rate[i]

ValueError: could not convert string to float: 'I'

Again, investigate the population development using population_over_time and NRR.

"INSERT CODE HERE"

'INSERT CODE HERE'

Is the population stable, growing, or declining?

How has the age distribution changed compared to the initial population?

Now imagine that we can run a campaign to get Danes to have more/fewer children. Can we achieve a net reproduction rate close to 1?

In the following cell we define a new Leslie matrix A_new, where the birth rates are multiplied by a factor f_factor. Can you find a value such that \(R_0\approx 1\)?

f_factor = 'INSERT CODE HERE' 

A_new = A.copy()
A_new[0] = f_factor * A_new[0] 

R0 = NRR(A_new)
print("Net reproduction rate R0 with changed fertility:", R0)

---------------------------------------------------------------------------
UFuncTypeError                            Traceback (most recent call last)
Cell In[49], line 4
      1 f_factor = 'INSERT CODE HERE' 
      3 A_new = A.copy()
----> 4 A_new[0] = f_factor * A_new[0] 
      6 R0 = NRR(A_new)
      7 print("Net reproduction rate R0 with changed fertility:", R0)

UFuncTypeError: ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U16'), dtype('float64')) -> None

Show the plots again for the new Leslie matrix and look at the long-run development.

_ = population_over_time(A_new, population, K=1000)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[50], line 1
----> 1 _ = population_over_time(A_new, population, K=1000)

NameError: name 'population' is not defined

Migration#

Instead of changing birth and survival rates in the Leslie matrix, the population’s development can also be affected by external factors, i.e. migration. So far we have not taken into account the migration term \(\boldsymbol{b}\) in

\[ \boldsymbol{x}_{k+1}=A\boldsymbol{x}_k+\boldsymbol{b}. \]

In the following cell, define the vector \(\boldsymbol{b}\) from the columns immigration and emigration and show the migration data in the given plot.

b = immigration-emigration #"INSERT CODE HERE"

plt.figure()
plt.plot(immigration, label='Immigration')
plt.plot(emigration, label='Emigration')
plt.plot(b, label='b')
plt.title('Migration')
plt.xlabel('Age')
plt.ylabel('Number of people')
plt.legend()
plt.show()

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[51], line 1
----> 1 b = immigration-emigration #"INSERT CODE HERE"
      3 plt.figure()
      4 plt.plot(immigration, label='Immigration')

NameError: name 'immigration' is not defined

If there exists an equilibrium \(\boldsymbol{x}_{lim}\) such that \(\boldsymbol{x}_k\to\boldsymbol{x}_{lim}\) for \(k\to\infty\), then

\[ \boldsymbol{x}_{lim}=A\boldsymbol{x}_{lim}+\boldsymbol{b}\implies(I-A)\boldsymbol{x}_{lim}=\boldsymbol{b}. \]

So we can find \(\boldsymbol{x}_{lim}\) as the solution to a linear system.

Solve the linear system for \(\boldsymbol{x}_{lim}\).
Hint: \(A\boldsymbol{x}=\boldsymbol{b}\) can be solved with np.linalg.solve(A,b).
Hint: The \(n\times n\) identity matrix is created with np.eye(n).

xlim = "INSERT CODE HERE" 

Show the equilibrium population in the following plot.

plt.figure(figsize=(8,4))
plt.bar(age, xlim)
plt.xlabel("Age")
plt.ylabel("Number of people")
plt.title("Age distribution in the initial population (0-99 years)")
plt.grid(axis="y", linestyle="--", alpha=0.7)
plt.show()

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[53], line 2
      1 plt.figure(figsize=(8,4))
----> 2 plt.bar(age, xlim)
      3 plt.xlabel("Age")
      4 plt.ylabel("Number of people")

NameError: name 'age' is not defined

<Figure size 800x400 with 0 Axes>

Can you verify this in simulations? Show the population development with population_over_time(A, initial_population, K, b) and give the migration vector as input b.

"INSERT CODE HERE"

'INSERT CODE HERE'

Do we reach equilibrium?

Compare with the corresponding plot without the migration term.

Computational complexity of matrix iteration#

and computed the population over time by iteratively applying

\[ \boldsymbol{x}_{k+1} = A \boldsymbol{x}_k = A (A (... (A \boldsymbol{x}_0) ...)). \]

So we have not computed \(A^{k+1} \boldsymbol{x}_0\) directly, but repeated the matrix-vector multiplication \(k+1\) times.

In this task you should estimate how many basic computations (multiplications and additions) are required for a \(100 \times 100\) Leslie matrix.

Use \(k = 50\)
Dimension: \(A\) is \(100 \times 100\)

We first define \(k\) and the dimension \(n\).

k = 50
n = 100

Consider the matrix-vector product:

How many multiplications are used?

How many additions are used?

Hint: For the dot product between two vectors in \(\mathbb{R}^n\) we use \(n\) multiplications and \(n-1\) additions.

Define the result in the following cell.
Hint: In Python, \(x^a\) is written as x**a.

mv_mul = "INSERT CODE HERE"
mv_add = "INSERT CODE HERE"
mv_ops = mv_mul + mv_add

Consider the same for a matrix-matrix product and define the result.

mm_mul = "INSERT CODE HERE"
mm_add = "INSERT CODE HERE"
mm_ops = mm_mul + mm_add

How many matrix-matrix and matrix-vector products, respectively, are used to compute \((A^{k+1})\boldsymbol{x}_0\) directly?

Use this to find the total number of operations in the computation and define the result.

direct_total = "INSERT CODE HERE"

Similarly, consider the number of matrix-matrix and matrix-vector products used for the iterative method \(\boldsymbol{x}_{k+1}=A(A(...A(A\boldsymbol{x}_0)))\).

Define the total number of operations.

iterative_total = "INSERT CODE HERE"

The following cell prints the two results and the ratio between them.