Code
pacman::p_load(sf, tidyverse,jsonlite,dplyr,stringr,knitr,ggplot2,lubridate,ggiraph,viridis,plotly,gridExtra,readr)June 26, 2024
FishEye analysts face a challenge: they purchased port exit records instead of ship off-load records. These records do not contain information about which vessels delivered the products. Therefore, we need to develop a method to associate vessels with their probable cargo. This report details our matching method used to address this issue.
Our method consists of three main steps, gradually narrowing down the range of possible vessels:
Initial matching using Harbor_Reports_data
Further matching using TransponderPing data
Filtering unlikely vessels using Vessel Movement data
Objective: Identify vessels associated with each order by merging Transaction_data and Harbor_Reports_data based on date and port.
Method:
Left join Transaction_data with Harbor_Reports_data based on date and port
For each transaction, identify vessels reported at the corresponding date and port
Objective: For orders not matched to vessels in step 3.1, use TransponderPing data to identify potential vessels.
Method:
For records where ProbableVessel is NA, match using transaction date and port with TransponderPing data
Identify vessels with signals near the corresponding date and port
Insight: After completing this step, we found that all previously unmatched cases correspond to multiple possible vessels, indicating the need for further filtering.
Objective: Use vessel_movement_data to determine the paths of vessels associated with each transaction during the observation period, and filter out unlikely vessels from the list of potential candidates.
Method:
Historical Path Analysis:
Criteria for Identifying Vessels That Delivered Products:
Arrival Record: The vessel has a documented arrival at the port.
Previous and Next Port Records: The vessel has records of arriving from a previous port and/or departing to another port after unloading.
Identifying Unlikely Vessels:
Through this multi-step matching process, we were able to associate vessels with their probable cargo, overcoming the limitation of lacking direct ship off-load records in the original data. This method integrates port reports, transponder signals, and vessel movement data to construct a relatively comprehensive association picture.
Based on this matching method, we further developed an interactive visualization system, enabling analysts to visually explore and validate these associations.
p <- ggplot(Transaction_matchresult2_visual, aes(x = quarter, y = 0, fill = unmatch,
data_id = source.transaction,
tooltip = paste("Date:", date_added, "<br>",
"Source Transaction:", source.transaction, "<br>",
"Probable Vessel:", probable_vessel))) +
geom_point_interactive(size = 5, shape = 21, color = "white",
position = position_jitter(width = 0.4, height = 0.05)) +
scale_fill_manual(values = c("match" = "green", "unmatch" = "red")) +
labs(title = "Summary of Cargos Match across Date & Quantity",
x = "Transaction Quarter", y = NULL, fill = "Match Status") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 0, hjust = 1, size = 20),
axis.text = element_text(size = 20),
axis.title = element_text(size = 20),
panel.grid.major.x = element_line(color = "gray", linetype = "dashed"),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
legend.text = element_text(size = 20),
strip.text.y = element_text(size = 20, angle = 0),
legend.position = "top",
strip.background = element_rect(fill = "gray80", color = "gray50"),
panel.background = element_rect(fill = "gray90"), #
plot.title = element_text(size = 25, hjust = 0.5)) +
facet_grid(target ~ ., scales = "free", space = "free") +
guides(x = guide_axis(angle = 45))The Shiny App below provides an illustration of the matched delivery reports. Here, the analyst can choose to filter by the time frame (based on the quarters), the fish species, and the ports in Oceanus. The illegal fish species that we have specified are those only found in the Preserve Areas, namely: Sockfish, Helenaa, Offidiaa and Salmon (which is not found in any of the fishing region).
---
title: "Matching Reports"
date-modified: "last-modified"
format:
html:
code-fold: true
code-tools: true
toc: false
execute:
eval: true
echo: true
warning: false
freeze: true
---
# 1. Introduction
FishEye analysts face a challenge: they purchased port exit records instead of ship off-load records. These records do not contain information about which vessels delivered the products. Therefore, we need to develop a method to associate vessels with their probable cargo. This report details our matching method used to address this issue.
# 2. Overview of the Matching Method
Our method consists of three main steps, gradually narrowing down the range of possible vessels:
1. Initial matching using **Harbor_Reports_data**
2. Further matching using **TransponderPing data**
3. Filtering unlikely vessels using **Vessel Movement data**
# 3. Detailed Matching Process
## 3.1 Initial Matching Using Harbor_Reports_data
**Objective**: Identify vessels associated with each order by merging Transaction_data and Harbor_Reports_data based on date and port.
**Method**:
- Left join Transaction_data with Harbor_Reports_data based on date and port
- For each transaction, identify vessels reported at the corresponding date and port
## 3.2 Further Matching Using TransponderPing Data
**Objective**: For orders not matched to vessels in step 3.1, use TransponderPing data to identify potential vessels.
**Method**:
- For records where ProbableVessel is NA, match using transaction date and port with TransponderPing data
- Identify vessels with signals near the corresponding date and port
**Insight**: After completing this step, we found that all previously unmatched cases correspond to multiple possible vessels, indicating the need for further filtering.
## 3.3 Filtering Unlikely Vessels Using Vessel Movement Data
**Objective**: Use vessel_movement_data to determine the paths of vessels associated with each transaction during the observation period, and filter out unlikely vessels from the list of potential candidates.
**Method**:
1. Historical Path Analysis:
- Examine the routes of all vessels associated with the current record within 24 hours before and after the current record time.
2. Criteria for Identifying Vessels That Delivered Products:
- Arrival Record: The vessel has a documented arrival at the port.
- Previous and Next Port Records: The vessel has records of arriving from a previous port and/or departing to another port after unloading.
3. Identifying Unlikely Vessels:
- Exclude the following vessels:\
a) Vessels that did not significantly change their geographical location within the observation time window (potentially docked at the current port).\
b) Vessels showing frequent geographical movements, which may indicate tourism or sightseeing activities rather than cargo transport.
# 4. Initial Visualisation
Through this multi-step matching process, we were able to associate vessels with their probable cargo, overcoming the limitation of lacking direct ship off-load records in the original data. This method integrates port reports, transponder signals, and vessel movement data to construct a relatively comprehensive association picture.
Based on this matching method, we further developed an interactive visualization system, enabling analysts to visually explore and validate these associations.
```{r}
pacman::p_load(sf, tidyverse,jsonlite,dplyr,stringr,knitr,ggplot2,lubridate,ggiraph,viridis,plotly,gridExtra,readr)
```
```{r}
Transaction_matchresult2_visual <- read_rds("data/thresult.rds")
```
```{r}
p <- ggplot(Transaction_matchresult2_visual, aes(x = quarter, y = 0, fill = unmatch,
data_id = source.transaction,
tooltip = paste("Date:", date_added, "<br>",
"Source Transaction:", source.transaction, "<br>",
"Probable Vessel:", probable_vessel))) +
geom_point_interactive(size = 5, shape = 21, color = "white",
position = position_jitter(width = 0.4, height = 0.05)) +
scale_fill_manual(values = c("match" = "green", "unmatch" = "red")) +
labs(title = "Summary of Cargos Match across Date & Quantity",
x = "Transaction Quarter", y = NULL, fill = "Match Status") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 0, hjust = 1, size = 20),
axis.text = element_text(size = 20),
axis.title = element_text(size = 20),
panel.grid.major.x = element_line(color = "gray", linetype = "dashed"),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
legend.text = element_text(size = 20),
strip.text.y = element_text(size = 20, angle = 0),
legend.position = "top",
strip.background = element_rect(fill = "gray80", color = "gray50"),
panel.background = element_rect(fill = "gray90"), #
plot.title = element_text(size = 25, hjust = 0.5)) +
facet_grid(target ~ ., scales = "free", space = "free") +
guides(x = guide_axis(angle = 45))
```
```{r}
girafe(ggobj = p, width_svg = 25, height_svg = 15, options = list(
opts_sizing(rescale = TRUE)))
```
# 5. Shiny App - Matched Reports
The Shiny App below provides an illustration of the matched delivery reports. Here, the analyst can choose to filter by the time frame (based on the quarters), the fish species, and the ports in Oceanus. The illegal fish species that we have specified are those only found in the Preserve Areas, namely: Sockfish, Helenaa, Offidiaa and Salmon (which is not found in any of the fishing region).
<iframe src="https://ynforvaashiny.shinyapps.io/608project/" width="100%" height="800px" scrolling="no" frameborder="0">
</iframe>