Getting started with `marlin`

Author

Echelle Burns

This workshop will provide you with an introduction to the marlin package, created by Dan Ovando.

Prerequisites

We will be using the marlin package, which we can install from GitHub using devtools.

install.packages("devtools")
devtools::install_github("DanOvando/marlin")

Because marlin is built using C++ on the back end, there are sometimes installation issues that occur. There are several GitHub issues in the marlin repository for troubleshooting this issue, so first see if your error has already been experienced. Otherwise, feel free to add a new issue to the repository.

Here are some of the install issues with solutions:

If you are getting compilation errors or API rate limit errors: see this issue and this issue
If you have an Apple M1 chip and are getting a clang error: see issue
If you are getting errors about not having the necessary tools to compile a package: see issue

If you have a Mac, one of the most sure fire ways to get marlin to load is to use macrtools and run through the following script:

# install.packages("remotes")
remotes::install_github("coatless-mac/macrtools")

# Remove all rtools first (xcode, gfortran, etc.)
macrtools::macos_rtools_uninstall()

# Re-install all rtools
macrtools::macos_rtools_install()

We will also be using tidyverse, sf, terra, and plot.matrix.

install.packages(c("tidyverse", "sf", "terra", "plot.matrix"))

Let’s go ahead and load the relevant packages now.

library(marlin)
library(tidyverse)
library(sf)
library(terra)
library(plot.matrix)

Data gathering

In this tutorial, we will be modeling the Stoplight Parrotfish (Sparisoma viride) population in St. Croix (shout out to Lewis for the species selection!).

Lewis with his favorite fish friends

marlin requires a lot of data and requires you to be diligent in making sure the data you import is accurate and consistent. We’re going to spend a lot of time setting up the model, and very little time actually running it!

Note that marlin can easily create a multi-species and multi-fleet model. The world is your oyster, but be aware of potential impacts to processing time.

Model parameters

We first need to define model parameters. These include: the model region, the model resolution, the model coordinate reference system (CRS), the number of total time steps, and the number of seasons.

One note about CRS - marlin often thinks in terms of square kilometers, so it’s best to choose a CRS that allows for measurements in meters (rather than degrees). I found the CRS for this example by Googling the CRS for St. Croix.

For our example, we will use the following:

# Define model region in latitude/longitude
model_region_4326 <- sf::st_bbox(c(xmin = -65.02, xmax = -64.4, 
                              ymin = 17.58, ymax = 17.89), 
                              crs = "epsg:4326")

# Define desired coordinate reference system 
# This should be a crs that allows for measurements in *meters*
crs <- "epsg:8118"

# Define model region using the desired CRS
model_region <- sf::st_transform(model_region_4326, crs = crs)

# Define resolution in meters
spat_resolution <-  5000

# Define total time steps
steps <- 50

# Define total number of seasons; if species showed seasonal differences
# in habitat preference, or aggregated to spawn, we would model more seasons
seasons <- 1

From these values, we can define more values that are necessary for marlin. These include defining the model area and cell size, and making several templates for the model region (raster, data.frame). We will do this below by first defining the model area as a raster:

# Define raster
model_raster <- terra::rast(xmin = model_region[["xmin"]], 
                            xmax = model_region[["xmax"]], 
                            ymin = model_region[["ymin"]], 
                            ymax = model_region[["ymax"]], 
                            crs = crs, 
                            resolution = spat_resolution, vals = 1)

To make our lives easier down the line, we also want to have a data.frame that we can use so that we make sure our latitude/longitudes match perfectly with the matrix x and y IDs. We’ll do that below:

# Create a data.frame template as well so that we can convert our 
# x and y cell locations into true coordinates
model_df_template <- model_raster %>% 
  terra::as.data.frame(., xy = TRUE) %>% 
  arrange(x, y)

y_matrix_df <- data.frame(y = model_df_template %>% 
                            distinct(y) %>% 
                            pull(y)) %>% 
  mutate(y_matrix = 1:nrow(.))

x_matrix_df <- data.frame(x = model_df_template %>% 
                            distinct(x) %>% 
                            pull(x)) %>% 
  mutate(x_matrix = 1:nrow(.))

model_df_template <- model_df_template %>% 
  select(x, y) %>% 
  left_join(y_matrix_df) %>% 
  left_join(x_matrix_df)

Finally, let’s create a list of model parameters, so that we can easily reference it later:

# Return list of model parameters
model_parameters <- list(resolution = rev(dim(model_raster)[1:2]),
                         num_patches_side1 = dim(model_raster)[2], 
                         num_patches_side2 = dim(model_raster)[1], 
                         patches = prod(dim(model_raster)[1:2]), 
                         patch_area_km2 = (spat_resolution/1000)^2,
                         patch_side_km = spat_resolution/1000, 
                         simulation_area_km2 = prod(dim(model_raster)[1:2])*(spat_resolution/1000)^2, 
                         seasons = seasons, 
                         years = steps)

We’re making this easy on ourselves today and not modeling an area that crosses the antimeridian. That could be it’s own tutorial! If you find yourself needing to use marlin for a region that crosses the antimeridian, don’t hesitate to reach out! I have a ton of code that does just that and will save you a few headaches.

Spatial layers

Another thing that we’ll need to supply are spatial layers - these include fishing grounds, landing sites, and proposed Marine Protected Areas (MPAs).

We won’t be creating landing sites today, but you can follow a similar methodology to add them for your own work. If supplied, marlin will use landing sites to calculate costs associated with traveling to/from landing sites and fishers will prioritize areas with high biomass that are closer to landing sites.

Fishing grounds

For this example, we will allow fishers to fish within the entire model area, apart from areas that are on land or areas that are in existing MPAs. We supply a data.frame to marlin of logical fishing areas that correspond to the x and y grid IDs of the model region.

We can start by reading in our existing MPA and rasterizing the layer of the non-protected areas. I use a subset of data gathered from MPAtlas that has been saved in our data/processed folder.

# Grab existing MPAs, convert to the appropriate CRS, and make valid
existing_mpas <- st_read(here::here("data", "processed", "existing_mpas.gpkg"), 
                         quiet = T) %>%
  st_transform(., crs) %>%
  st_make_valid()

# Create a raster that gives existing MPAs a value of NA and open areas a value of 1
non_mpa_raster <- terra::rasterize(existing_mpas %>% 
                                     st_crop(., model_raster),
                                   model_raster, field = NA, 
                                   background = 1)

Now, we can mask the non_mpa_raster by the St. Croix land file we have in our data/processed folder.

# Read in land 
land <- st_read(here::here("data", "processed", "st_croix_land.gpkg"), 
                quiet = TRUE) %>% 
  st_transform(., crs)

# Mask the non_mpa_raster by land 
fishing_ground_raster <- non_mpa_raster %>% 
  terra::mask(., land, inverse = TRUE)

# Rename the layer for this raster
names(fishing_ground_raster) <- "fishing_grounds"

We can convert this into a data.frame, which is what marlin is looking for in respect to fishing grounds.

# Convert to dataframe
fishing_grounds_df <- as.data.frame(fishing_ground_raster, 
                                    xy = TRUE, na.rm = FALSE) %>%
  left_join(model_df_template) %>%
  mutate(fishing_grounds = ifelse(is.na(fishing_grounds), FALSE, TRUE)) %>%
  select(-x, -y) %>%
  rename(x = x_matrix,
         y = y_matrix)

Proposed MPAs

We can now move on to proposed MPA areas. I’ve just quickly created a new proposed MPA under the assumption that St. Croix would want to expand a previously-existing MPA. We will use the same methodology we used for the fishing grounds layer:

proposed_mpa <- st_read(here::here("data", "processed", "new_mpa.gpkg"), quiet = T) %>%
  st_transform(., crs) %>%
  st_make_valid() 

proposed_mpa_raster <- terra::rasterize(proposed_mpa %>% 
                                     st_crop(., model_raster),
                                   model_raster, field = 1, background = NA)
names(proposed_mpa_raster) <- "mpa"

proposed_mpa_df <- as.data.frame(proposed_mpa_raster, xy = TRUE, na.rm = FALSE) %>%
  left_join(model_df_template) %>%
  mutate(mpa = ifelse(is.na(mpa), FALSE, TRUE)) %>%
  select(-x, -y) %>%
  rename(x = x_matrix,
         y = y_matrix)

Fish parameters

Now that we have our overall model parameters completed, we’ll need to define some fish parameters. These include:

Life-history characteristics of the model species
An idea of movement for the model species
Spatial habitat preference or abundance layers for the model species

Each of these will take some time, effort, and lots of coffee (or tea).

Life-history characteristics

Let’s dig into the life-history characteristics. In an ideal world, we would be able to find all the life-history characteristics of our model species from a recent stock assessment in the region of interest. We know that’s not always possible, so we will work with what we have and marlin can help us if we get stuck. In the back-end, marlin uses the FishLife package to fill any life-history parameters that we may be missing. Beware, though, as this can create some “frankenfish” if we’re not careful. We want to make sure that the life-history values that are supplied are consistent in both units and spatial origin (something that FishLife doesn’t check for when pulling data). If we can’t find estimates that match perfectly, we should try to find estimates from studies with similar methods in nearby regions.

For this example, I’ve grabbed the relevant life-history characteristics for Stoplight Parrotfish that are referenced in the US Caribbean Stoplight Parrotfish – St. Croix Assessment Report (SEDAR 2025). Many of these values originated from Shervette et al., (2024). The assessment report has results from several models, so we will use values from the most comprehensive model listed in the report (m3 v7, a model that includes hermaphroditism, adjusted age at t0, continuous recruitment, and catch uncertainty and includes fishery-independent length data and recruitment deviations). Note that for simplicity, this marlin run does not account for hermaphroditism; if you’re modeling a hermaphroditic species or a sexually dimorphic species, you would probably want to build two separate species models - one for males and one for females.

# Define life-history characteristics, take note of the units and the source
lh <- list(
  linf = 33.8, # mm converted to cm fork length from Shervette et al., 2024
  vbk = 0.33, # unitless from Shervette et al., 2024
  t0 = -0.52, # unitless from Shervette et al., 2024
  max_age = 20, # years from SEDAR 2025
  weight_a =  3.18e-5, # for weight in kg and length in cm, from SEDAR 2025
  weight_b = 2.90, # for weight in kg and length in cm, from SEDAR 2025
  length_50_mature = 15.3, # mm converted to cm fork length from Shervette et al., 2024
  age_50_mature = 1.6, # years from Shervette et al., 2024
  m = 0.27, # from SEDAR 2025
  steepness = 0.99, # from SEDAR 2025 
  r0 = 4.61*1000, # mt converted to kg, from SEDAR 2025 
  ssb0 = 37.67*1000, # mt converted to kg, from SEDAR 2025
  fished_depletion = 0.74 # from SEDAR 2025
)

We define as many life history variables as we can above, but note that there are many more that we could provide if we had the data. See ?Fish for a full list of parameters.

One thing to note is that the reported steepness parameter is very high (0.99). This means that the number of recruits is virtually independent from the spawning stock biomass; the number of recruits each year will be constant despite changes to spawning stock biomass. Since this is reported in the most recent assessment, we’ll keep it here, but this is something to keep in mind if the results of the model are not what we would expect based on local knowledge of the population.

Species movement - home range

Next, we need to find an appropriate home range for the species. We can often find this in the literature from telemetry studies. marlin uses home range to predict a “diffusion” rate, or an average distance that an animal is expected to move over one time step. marlin can account for species movement in three ways 1) “taxis”, or the direct movement from less preferred habitats to more preferred ones (derived from the habitat matrix that we’ll supply later), 2) “diffusion” (derived from the home range parameter), and 3) “advection”, or movement associated by drifting with the currents. Advection is currently set to 0 in this version of marlin. Notice that the diffusion parameter should not include directed movements to preferred habitats - when we’re searching the literature for a home range, we want to make sure that the value we use was not calculated from individuals that were translocated (moved to a different site to see if they would return to their capture location) or that were undergoing migrations to/from habitats for activities like spawning. Another important thing to note is that the home range that marlin is actually a linear distance, and is used to determine the distance from the starting point that roughly 95% of fish are within after one time step. However, most home ranges reported in the literature are areas. When we add the home range value to marlin, we need to make sure that it has the same units as our model area (here, km) and is a linear distance. If we envision the literature reported home range area as a circle, we can convert it to a linear distance using the square root of the quotient of the reported home range area divided by pi. For more information see this issue.

After a quick search of Google Scholar, I was unable to find a home range estimate for Stoplight Parrotfish in La Croix. I did, however, find estimated territory ranges of Stoplight Parrotfish in the Florida Keys. While not ideal, we can use this estimate as a starting point (and low end estimate) to get marlin running. If we were to use this for a real research project, we would probably want to run a sensitivity analysis with different home range values to see how much our results would change if our selected home range was wrong.

Smith et al., (2023) find that some Stoplight Parrotfish had territory sizes up to 282.6 $m^2$, so we’ll use that value for now. We want to make sure that the home range we supply is a linear distance and is the same units that we use for our model region (km).

home_range <- sqrt(282.6/pi)/1000

Habitat preferences

Finding appropriate habitat layers starts getting a little tricky, but there are many ways to find something that works. Ideally, you will have a heatmap of occurrence for the species of interest in your model region and you will be able to directly pull the data and use it as a habitat layer. Other suitable options include: if you have geospatial catch data - you can assume that areas where catch is highest are preferred habitats (though this also assumes fishers have perfect knowledge of biomass), if you have habitat data and you know your species favors particular habitats - you can assume distances further away from the habitat of interest have lower biomass.

For this example, we know Stoplight Parrotfish are reef-associated. So, we pulled reef location data from the Allen Coral Atlas, rasterized the data using our model region parameters, and calculated distances for each cell from the reef. We then standardized values between 0 and 1, where values of 1 are close to or on top of the reef and values of 0 are furthest away from the reef. Land areas are NAs. It’s very important that land is represented as NA, as NA indicates the animal cannot go to that area, while 0 indicates that the animal could theoretically swim through the area, but does not prefer it.

Note that our habitat layer that is saved in data/processed is already in the proper CRS and resolution for our model region. This is not the norm; you’ll usually have to reproject and resample the data to make sure it matches our model parameters.

# Load in the habitat dataframe
habitat_df <- read.csv(here::here("data", "processed", "habitat_suitability.csv"))

# Convert to raster
habitat_raster <- terra::rast(habitat_df, type = "xyz", crs = crs)

# Convert to matrix 
habitat_matrix <- matrix(habitat_df %>% 
                           arrange(x, desc(y)) %>% 
                           pull(habitat_suitability), 
                         ncol = dim(habitat_raster)[2], 
                         nrow = dim(habitat_raster)[1])

Now, we want to make sure that the way our data go into marlin are the way that marlin expects them. We can test this by plotting the matrix using plot.matrix and the data.frame version of the matrix joined to the x and y cell locations.

# Test our outputs - this should be the same orientation that we would see on something like Google Maps
terra::plot(habitat_raster, type = "interval")

The raster output showing the preferred habitats of Stoplight Parrotfish.

# Test our matrix - this should match how our raster plot looks
plot(habitat_matrix)

The matrix output showing the preferred habitats of Stoplight Parrotfish.

# Test against our dataframe - using the x and y grid cell ids 
habitat_df %>% 
  left_join(model_df_template) %>% 
  mutate(binned_suitability = round(habitat_suitability/0.2)*0.2) %>% 
  ggplot() + 
  geom_tile(mapping = aes(x = x_matrix, y = y_matrix, 
                          fill = binned_suitability)) + 
  coord_sf()

The data.frame output showing the preferred habitats of Stoplight Parrotfish.

The orientation seems right in each of these plots!

One fun thing to note here is that marlin does allow for the incorporation of climate change via multiple habitat layers (one per step). So, if you have the data handy, you can provide marlin with a list of habitat matrices for each projected time step. See the Set Dynamic Habitats vignette for more information.

Thinking about recruitment

We also need to think about what the most appropriate recruitment parameters are. First, we need to decide the density dependence for recruits. There are currently five options:

“global_habitat” - density dependence operates on the total spawning stock biomass summed across all patches and recruits are distributed according to the habitat layer
“local_habitat” - density dependence is local and recruits stay in the patch they were produced (there’s no movement)
“pre_dispersal” - density dependence is local to the spawning patch based on the local spawning stock biomass, but recruits are distributed according to the habitat layer
“post_dispersal” - larvae disperse according to the habitat layer and density dependence occurs at the destination patch based on local spawning stock biomass
“global_ssb” - density dependence operates on the total spawning stock biomass summed across all patches and recruits are allocated in proportion to the spatial distribution of the spawning stock biomass (not the habitat layer)

Given that Stoplight Parrotfish have a larval period, maybe using “post_dispersal” is the right call here - larvae can move about and settle in the most preferred habitats, where competition occurs based on biomass to see how many recruits survive to make it to the fishery.

density_dependence <- "post_dispersal"

For more information on these recruitment parameters, see here.

Another thing we need to think about is the dispersal distance (or the “home range”) of these recruits. Robertson et al., (2006) and Robertson and van Tassel (2015) report Stoplight Parrotfish larvae can have a pelagic duration fo 47-80 days. Modeling efforts of stochasticity in larval connectivity, using the California coast as a case study, estimate that settlement could occur upwards of 500 km from the origin location (Seigel et al., 2008]. There are no current studies of larval connectivity in St. Croix, so this is another variable that we will need to “guesstimate” and do a sensitivity analysis. For now, let’s assume a larval home range of 250 km (somewhere in the middle).

recruit_home_range <- 250

If we find that our results aren’t what we expect, this might be the first thing we re-evaluate.

Create the `fauna` object

Now we can create our fauna object for marlin. Here, we specify all the life-history characteristics, movement data, and habitat preferences. One important thing to note here is that if we don’t have separate habitat matrix layers for adults and recruits to the fishery, we need to specify the habitat matrix that we have for both the adults and the recruits, otherwise marlin will let recruits spawn on land and then they will get stuck there (sad recruits).

fauna <- list("stoplight" = create_critter(scientific_name = "sparisoma viride", 
                                           adult_home_range = home_range, 
                                           recruit_home_range = recruit_home_range, 
                                           habitat = habitat_matrix, 
                                           recruit_habitat = habitat_matrix,
                                           density_dependence = density_dependence, 
                                           linf = lh$linf,
                                           vbk = lh$vbk,
                                           t0 = lh$t0,
                                           max_age = lh$max_age,
                                           weight_a =  lh$weight_a,
                                           weight_b = lh$weight_b,
                                           length_50_mature = lh$length_50_mature,
                                           age_50_mature = lh$age_50_mature,
                                           m = lh$m,
                                           steepness = lh$steepness,
                                           r0 = lh$r0,
                                           ssb0 = lh$ssb0,
                                           fished_depletion = lh$fished_depletion, 
                                           resolution = model_parameters$resolution, 
                                           patch_area = model_parameters$patch_area_km2))

We can check that our fauna object looks okay by plotting our values at age:

fauna$stoplight$plot()

We can also look at the movement plots for our defined fauna:

fauna$stoplight$plot_movement()

Here, we can see that the post-recruits (adults) in the fishery typically stay their starting position, which matches well with what we would expect for a reef-associated species with a small home range. For the recruits, “movement” relies on density dependence. The recruit plot shows the proportion of the total number of recruits that are in each cell. In our example, these values are very small so they all look like 0s when the color scale ranges from 0-1. We can plot this by itself if we’re curious, but it should look similar to the habitat layer we provide, since the habitat layer is what is used to spatially distribute the number of recruits.

model_df_template %>% 
  mutate(r0s = fauna$stoplight$r0s, 
         r0s_scaled = r0s/sum(r0s)) %>% 
  ggplot() + 
  geom_tile(mapping = aes(x = x_matrix, y = y_matrix, fill = r0s_scaled)) + 
  scale_fill_viridis_c() + 
  labs(fill = "Proportion of recruits") + 
  coord_sf() + 
  theme_void() + 
  theme(legend.position = "bottom", 
        legend.title = element_text(vjust = 0.8), 
        legend.text = element_text(angle = 45, hjust = 1))

Rescaled recruit movement plot, which is essentially the spatial distribution of recruits and is dependent on the habitat matrix that is supplied.

Fleet parameters

Fleet selectivity

Fishing for Stoplight Parrotfish in St. Croix can be done via spearfishing, by hand via diving, or by pots and traps. The stock assessment report does not differentiate between these “commercial” fisheries, so we will not either. The assessment report provides selectivity curves for each model that they ran for the “commercial” fishery:

Selectivity curves for Stoplight Parrotfish provided by SEDAR, 2025

marlin allows you to enter selectivity at age parameters for dome-shaped, logistic, or double-normal curves. However, I find it easier to supply a manual selectivity curve so that I’m sure marlin interprets the selectivity parameter the way I want it to. I used ImageJ to grab the x and y coordinates for the selectivity curves presented in SEDAR (2025) and saved them to our data/processed folder. We can read them in now.

selectivity <- read.csv(here::here("data", "processed", "selectivity_curve.csv")) %>% 
  rename(length_cm = X, 
         selectivity = Y)

To convert this into a selectivity curve marlin can understand, we want to create a smoothed line from our point estimates, so that we can select any x coordinate and retrieve a y value. We can do this by using smooth.spline():

smoothed_selectivity <- smooth.spline(selectivity$length_cm, 
                                      selectivity$selectivity, 
                                      spar = 0.10)

Let’s take a look at this selectivity curve to make sure it matches the one from SEDAR (2025) by predicting every length between 0 and 40 cm:

predict_all_lengths <- predict(smoothed_selectivity, 0:40)

ggplot() +
  geom_line(mapping = aes(x = predict_all_lengths$x, 
                          y = predict_all_lengths$y)) + 
  labs(x = "Length (cm)", y = "Selectivity")

That looks pretty close!

Now, the selectivity curve from SEDAR (2025) is for lengths, but marlin thinks about selectivity by age. So we’ll have to pull selectivity values for the lengths that correspond to each age class in our fauna object. We can do this by predicting selectivity values across fauna$stoplight$length_at_age:

selectivity_values <- predict(smoothed_selectivity, 
                              fauna$stoplight$length_at_age)[["y"]]

We also want to make sure that the selectivity values don’t go beyond 0 and 1 (this could happen if I clicked a little bit too high/low on the curve in ImageJ), so we can manually adjust those values here:

selectivity_values[(selectivity_values < 0) | is.na(selectivity_values)] <- 0
selectivity_values[(selectivity_values > 1)] <- 1

If we wanted to model many species or many fisheries, we would need selectivity curves for each species and fishery combination.

Relative exploitation rate

We also need to supply a relative exploitation rate (p_explt) for each species that a fleet targets. When we’re only modeling one species and one fleet, p_explt = 1. However, when multiple fleets target the same species, we would need to think about our p_explt values. For example, if fleet A and fleet B are both fishing for Stoplight Parrotfish, but fleet A is responsible for 2/3 of the total catch of the species, reasonable p_explt values would be 2 for fleet A and 1 for fleet B. The best way to figure out which p_explt values to use would be to find landings data and calculate the relative contribution of each fleet to a species’ total landings. If a species is caught at all by a fleet, even as bycatch p_explt > 0.

We’ll use a p_explt of 1 for our example:

p_explt <- 1

Landing price

We also need to specify a price per species, so that a fleet that targets multiple species knows which species to prioritize catching. For example, if a fleet targets 3 species, but would rather catch species A because it’s more valuable, the price for that species should be higher than the price for others. The best way to find these values is by looking up ex-vessel prices for each of the species you’re modeling (ideally specific to the model region). Just make sure that the price is relative to the units that you’ve used for your fauna parameter. Since we’ve specified values in kg, we would want our price to be in USD/kg.

Callwood et al., (2021) report that parrotfish in the Bahamas can sell for 2.21 to 44.09 USD per kg. We’ll choose a value in the middle - 18 USD/kg.

price <- 18

The price parameter is where you could best account for true bycatch species, where there is no true economic incentive for catching the species (price = 0). You can also include negative prices to indicate that fishers want to to avoid catching that species; this is useful for protected species that are subject to penalties if landed.

How should our fleet respond to management changes

Finally, we have to decide how our fleet should respond to changes in management. This includes:

What happens when an MPA is implemented? Do fishers stay in the fishery and redistribute to the open fishing areas? Or do they leave the fishery?
How do we expect fishers to fish? Is this an open access situation, or constant effort until something changes?
How should fishers distribute themselves spatially? Should it be by revenues from catch? Profits? Revenues per unit effort? Profits per unit effort?

For this example, we make the following decisions:

If an MPA is implemented, fishers will leave and their effort will be removed from the study area.
We expect fishers to fish under constant effort (until a management change takes place). Dan warns against using open access unless it’s absolutely necessary because it adds more levels of complexity that make interpreting results more difficult. Note that if you do choose to use open access, you will have to determine whether your fishery is currently in equilibrium and set the cost-to-revenue ratio (cr_ratio) as appropriate.
We expect fishers to spatially distribute based on revenues.

Create the `fleet` object

Now we have all the components we need to build our fleet object. Let’s build it using create_fleet():

fleet <- list(
  "commercial" = create_fleet(
    metiers =
      list(
        "stoplight" = Metier$new(
          critter = fauna$stoplight, 
          price = price, 
          sel_form = "manual", 
          sel_at_age = selectivity_values, 
          p_explt = p_explt)), 
    mpa_response = "leave", 
    fleet_model = "constant_effort",
    spatial_allocation = "revenue",
    resolution = model_parameters$resolution, 
    patch_area = model_parameters$patch_area_km2, 
    fishing_grounds = fishing_grounds_df
  )
)

Once the fleet is created, we need to tune it. We will have marlin tune the fleet to meet the depletion level that we supplied in the fauna object. If we specified an initial exploitation rate in our fauna object, we could tune it to that instead by setting tune_type to “explt”.

tuned_fleet <- marlin::tune_fleets(fauna = fauna, 
                                   fleets = fleet,
                                   tune_type = "depletion")

Phew! Ok, now we have all of our components ready to run the model!

Running the model

Now that we can run the model, here’s a quick schematic of how marlin thinks:

We can start by running a business as usual (BAU) model. First, we want to make sure that we’re incorporating any existing management measures in our BAU model and in our MPA model. SEDAR (2025) mentions that there is an annual catch limit of 72,365 lbs (~32,824 kg). This is actually spread across two parrotfish species, but we’ll just keep it as is for now since we don’t know what proportion of total catch the Stoplight Parrotfish make up. We can specify this in our model using the manager argument in simmar(). Creating a model in marlin requires to parts: first simmar() creates the simulation, then process_marlin() processes the results so that we can better understand them. We do both below for BAU:

# BAU model run
sim_bau <- simmar(fauna = fauna, 
                  fleets = tuned_fleet, 
                  years = model_parameters$years, 
                  manager = list(quotas = list(stoplight = 32824)))

processed_bau <- process_marlin(sim = sim_bau, 
                                time_step = 1, # fraction of a year per step
                                keep_age = FALSE) # aggregate all ages in results

Now, for our MPA model, we want to add an MPA to the manager argument and decide in which year the MPA is implemented. Note that because marlin is created in the back-end by C++, marlin uses 0-indexing, but we (and R) intuitively use 1-indexing. So, if we want an MPA implemented in year 5 of the model, we need to tell marlin to actually implement it in year 6.

# MPA is implemented in year 5
sim_implemented <- simmar(fauna = fauna, 
                          fleets = tuned_fleet, 
                          years = model_parameters$years, 
                          manager = list(quotas = list(stoplight = 32824), 
                                         mpas = list(locations = proposed_mpa_df, 
                                                     mpa_year = 6)))

processed_implemented <- process_marlin(sim = sim_implemented, 
                                        time_step = 1, # fraction of a year per step
                                        keep_age = FALSE) # aggregate all ages in results

Analyzing results

Now that we’ve run our model, we can start looking at the outputs. marlin has some nifty plot functions that we can use, but let’s first take a look at what the outputs look like, so that we know how to make our own plots from scratch if we want to. We’ll use the BAU run as an example:

processed_bau

$fauna
# A tibble: 85,800 × 13
   critter   patch     x     y age       n     b   ssb     c  step  year season
   <chr>     <int> <int> <int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int>  <int>
 1 stoplight     1     1     1 all    172.  15.4  13.7  3.95     1     1      1
 2 stoplight     2     1     2 all    182.  16.4  14.6  4.23     1     1      1
 3 stoplight     3     1     3 all    192.  17.2  15.3  4.48     1     1      1
 4 stoplight     4     1     4 all    201.  18.0  16.0  4.70     1     1      1
 5 stoplight     5     1     5 all    208.  18.6  16.6  4.88     1     1      1
 6 stoplight     6     1     6 all    214.  19.2  17.0  5.03     1     1      1
 7 stoplight     7     1     7 all    218.  19.5  17.4  5.15     1     1      1
 8 stoplight     8     1     8 all    220.  19.8  17.6  5.21     1     1      1
 9 stoplight     9     1     9 all    221.  19.8  17.6  5.23     1     1      1
10 stoplight    10     1    10 all    227.  20.4  18.1  5.40     1     1      1
# ℹ 85,790 more rows
# ℹ 1 more variable: mean_length <dbl>

$fleets
# A tibble: 85,800 × 14
   critter  patch     x     y age   fleet catch revenue effort  cpue  step  year
   <chr>    <int> <int> <int> <chr> <chr> <dbl>   <dbl>  <dbl> <dbl> <dbl> <int>
 1 stoplig…     1     1     1 all   comm…  3.95    71.1   1.04  3.80     1     1
 2 stoplig…     2     1     2 all   comm…  4.23    76.1   1.05  4.03     1     1
 3 stoplig…     3     1     3 all   comm…  4.48    80.6   1.06  4.24     1     1
 4 stoplig…     4     1     4 all   comm…  4.70    84.5   1.06  4.42     1     1
 5 stoplig…     5     1     5 all   comm…  4.88    87.9   1.07  4.58     1     1
 6 stoplig…     6     1     6 all   comm…  5.03    90.6   1.07  4.70     1     1
 7 stoplig…     7     1     7 all   comm…  5.15    92.6   1.07  4.79     1     1
 8 stoplig…     8     1     8 all   comm…  5.21    93.8   1.08  4.84     1     1
 9 stoplig…     9     1     9 all   comm…  5.23    94.2   1.08  4.86     1     1
10 stoplig…    10     1    10 all   comm…  5.40    97.1   1.08  4.99     1     1
# ℹ 85,790 more rows
# ℹ 2 more variables: season <int>, mean_length <dbl>

Our output is a list, with tibbles for fauna (which includes species-, patch-, and year-level values of number of individuals, biomass, spawning stock biomass, and catch) and fleets (which includes species-, fleet-, patch-, and year- level values for catch, revenue, effort, and catch per unit effort). Note that these outputs are the sums of all age classes because used keep_age = FALSE in the process_marlin() above.

Now we can jump into using plot_marlin() to see some outputs of these results.

First, let’s check our spatial results at year 1 - they should look identical. They should show no catch in existing MPAs and catch proportional to the habitat matrix otherwise:

plot_marlin('BAU' = processed_bau, 'Implemented' = processed_implemented, 
            plot_var = "c", 
            plot_type = "space", 
            steps_to_plot = 1, 
            max_scale = FALSE) + 
  coord_sf()

Spatial `marlin` output for catch (kg) from our BAU and MPA-implemented model runs at year 1.

Now again at year 30 - they should look different now. The BAU result should allow fishing in the same regions as it did in year 1, but the MPA scenario should show no catch in the newly proposed MPA area:

plot_marlin('BAU' = processed_bau, 'Implemented' = processed_implemented, 
            plot_var = "c", 
            plot_type = "space", 
            steps_to_plot = 30, 
            max_scale = FALSE) + 
  coord_sf()

Spatial `marlin` output for catch (kg) from our BAU and MPA-implemented model runs at year 30.

We can also use plot_marlin() to get time series plots. Let’s look at catch over time:

plot_marlin('BAU' = processed_bau, 'Implemented' = processed_implemented, 
            plot_var = "c", 
            max_scale = FALSE) + 
  geom_vline(xintercept = 5, linetype = "dashed")

Catch (kg) of Stoplight Parrotfish over time, under BAU and after an MPA is implemented in year 5.

Let’s also look at spawning stock biomass over time:

plot_marlin('BAU' = processed_bau, 'Implemented' = processed_implemented, 
            plot_var = "ssb", 
            max_scale = FALSE) + 
  geom_vline(xintercept = 5, linetype = "dashed")

Spawning stock biomass (kg) of Stoplight Parrotfish over time, under BAU and after an MPA is implemented in year 5.

We see that catch (and spawning stock biomass) both start really high, which leads to a population crash. The population then begins to recover/stabilize as catch rates flatted out at a lower level. However, we notice that both the BAU and MPA scenarios show this same trend. For our example, we assume that the state of the fishery is in equilibrium under BAU - which means we want our BAU lines to be incredibly stable. We discuss how to achieve this in the next section.

Before we get into that, we will often want to use this as a means to see how total revenues change with the implementation of on MPA. This isn’t something that we can plot using plot_marlin(), so let’s plot it manually below. For this model run, since we didn’t specify landing sites, marlin calculates revenues as the product of catch and the supplied price variable.

revenues <- processed_bau$fleets %>% 
  mutate(model = "BAU") %>% 
  bind_rows(processed_implemented$fleets %>% 
              mutate(model = "Implemented")) %>% 
  group_by(year, model) %>% 
  summarise(total_revenue = sum(revenue, na.rm = T)) %>% 
  ungroup() 

ggplot() + 
  geom_line(data = revenues, 
            mapping = aes(x = year, y = total_revenue, color = model), 
            linewidth = 2) +  
  scale_color_manual(values = c("BAU" = "black", "Implemented" = "gray")) + 
  geom_vline(xintercept = 5, linetype = "dashed") + 
  labs(color = NULL, 
       x = "Year", 
       y = "Total revenue (USD)") + 
  scale_y_continuous(limits = c(0, NA), labels = scales::comma) + 
  theme_marlin() + 
  theme(legend.position = "top")

Total revenue per year over time, under BAU and after an MPA is implemented in year 5.

It’s hard for us to intrepret these results when we can see the model is not in equilibrium (like we want). So let’s move on and run the model under equilibrium conditions.

Adjusting the model to meet starting equilibrium conditions

We want our BAU model to reflect that the fishery is in equilibrium and that results will be stable over time. One way to achieve this is by adding a manual “burn in” period, under which time models are allowed to stabilize before an MPA is implemented. To do this, we will essentially run the model for additional years and discard the years that we designate as “burn in” years. The population under BAU began to stabilize around year 40 in the above example, so we’ll add that many additional years to our model.

burn_years <- 40

We will now re-run our model using the adjusted time periods:

# BAU model run
sim_bau <- simmar(fauna = fauna, 
                  fleets = tuned_fleet, 
                  years = burn_years + model_parameters$years, 
                  manager = list(quotas = list(stoplight = 32824)))

processed_bau <- process_marlin(sim = sim_bau, 
                                time_step = 1, # fraction of a year per step
                                keep_age = FALSE) # aggregate all ages in results 

# MPA is implemented in year 5
sim_implemented <- simmar(fauna = fauna, 
                          fleets = tuned_fleet, 
                          years = burn_years + model_parameters$years, 
                          manager = list(quotas = list(stoplight = 32824), 
                                         mpas = list(locations = proposed_mpa_df, 
                                                     mpa_year = burn_years + 6)))

processed_implemented <- process_marlin(sim = sim_implemented, 
                                        time_step = 1, # fraction of a year per step
                                        keep_age = FALSE) # aggregate all ages in results

And then re-classify our years in the outputs:

processed_bau <- purrr::map(.x = 1:length(processed_bau), 
                            .f = ~processed_bau[[.x]] %>% 
                              mutate(year = year - burn_years, 
                                     step = step - burn_years) %>% 
                              filter(year >= 0))

names(processed_bau) <- c("fauna", "fleets")

processed_implemented <- purrr::map(.x = 1:length(processed_implemented), 
                            .f = ~processed_implemented[[.x]] %>% 
                              mutate(year = year - burn_years, 
                                     step = step - burn_years) %>% 
                              filter(year >= 0))

names(processed_implemented) <- c("fauna", "fleets")

We can now re-make the figures from before:

plot_marlin('BAU' = processed_bau, 'Implemented' = processed_implemented, 
            plot_var = "c", 
            max_scale = FALSE) + 
  geom_vline(xintercept = 5, linetype = "dashed")

plot_marlin('BAU' = processed_bau, 'Implemented' = processed_implemented, 
            plot_var = "ssb", 
            max_scale = FALSE) + 
  geom_vline(xintercept = 5, linetype = "dashed")

We now have results that look much more stable under BAU, and there’s a clear change to both catch and spawning stock biomass when the MPA is implemented. We can look at our revenue results again:

revenues <- processed_bau$fleets %>% 
  mutate(model = "BAU") %>% 
  bind_rows(processed_implemented$fleets %>% 
              mutate(model = "Implemented")) %>% 
  group_by(year, model) %>% 
  summarise(total_revenue = sum(revenue, na.rm = T)) %>% 
  ungroup() 

ggplot() + 
  geom_line(data = revenues, 
            mapping = aes(x = year, y = total_revenue, color = model), 
            linewidth = 2) +  
  scale_color_manual(values = c("BAU" = "black", "Implemented" = "gray")) + 
  geom_vline(xintercept = 5, linetype = "dashed") + 
  labs(color = NULL, 
       x = "Year", 
       y = "Total revenue (USD)") + 
  scale_y_continuous(limits = c(0, NA), labels = scales::comma) + 
  theme_marlin() + 
  theme(legend.position = "top")

Other cool `marlin` features, tips, and tricks

Dan is constantly working to improve marlin - making it faster, add new features, etc.

Modeling at coarse- and fine-scale resolutions simultaneously

One new feature that he’s recently implemented is a “moat” of coarser scale cells around a fine-scale area of interest. This can come in handy if you’re interested in modeling a species with a large distribution, but are primarily interested in the fine-scale results from a small section of the distribution area. We’re still working to put this into practice using real-world data (see issue), but there is a vignette on how to get started.

Modeling discard mortality

marlin currently does not allow for modeling discard mortality, where an animal is caught, released, and then has some probability of survival. If this is a feature you’re interested in, feel free to start a new issue in the marlin github repository. As is, the only way to model discards is to treat is as a normal species and remove the catch data for that species after running process_marlin(). However, this would still assume a 100% mortality rate of discards, because the model is running under the assumption that caught fish are landed.

If you expect an MPA to make a larger impact

A lot of this modeling is theoretical, and your analyses will benefit greatly from running several sensitivity analyses (e.g., home range, recruitment parameters, etc.). If, given your knowledge of the fishery and/or species modeled, you expect your MPA to result in spillover from protected areas to non-protected areas, you will need to tweak your input parameters. The most likely culprits are:

Your steepness value is too high - steepness values close to 1 indicate that the number of recruits is independent from the spawning biomass; if spawning biomass increases (or decreases), the number of recruits stay the same
Your larval home_range is too small - if the larvae are forced to occupy small areas, they will never have the opportunity to move out of the protected region; the number of recruits will constantly be limited by the density dependence within the protected area
Your density_dependence choice was wrong - if you choose a density dependence such as “local_habitat”, your larvae aren’t dispersing to new areas at all; again, they’re stuck where they spawn and are limited by the space available in that single area