---
title: "Introducing synopR"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introducing synopR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
**2026-04-08**
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```


## Standard workflow

`show_synop_data()` is the package's core function. It takes a character vector, or a data frame column, in which each element is a SYNOP string. It is vectorized, so large vectors can be processed in seconds. Before decoding, however, SYNOP messages should be checked with `check_synop()`. This function verifies that every message starts with "AAXX" and ends with "=", that it contains no invalid characters (once "AAXX" and "=" are removed, the valid characters are the digits 0-9, '/' and 'NIL'), and that every group consists of 5 digits (except for the section identifiers '333' and '555'). It returns a data frame with two columns: a logical column indicating whether each message is valid, which can be used to filter out invalid messages, and a second column describing possible errors (SYNOP messages with a missing 'AAXX' or '=', or with groups that don't respect the 5-digit format).
 
```{r}
library(synopR)

# Notice that the second SYNOP will be removed because of the incomplete group '8127'
data_input_vector <- c("AAXX 04003 87736 32965 00000 10204 20106 39982 40074 5//// 333 10266 20158 =",
                       "AAXX 03183 87736 32965 12708 10254 20052 30005 40098 5//// 80005 333 56000 8127 =",
                       "AAXX 03183 87736 32965 12708 10254 20052 30005 40098 5//// 80005 333 56000 81270 =")

checked <- check_synop(data_input_vector)
my_data <- show_synop_data(data_input_vector[checked$is_valid])

knitr::kable(t(my_data))

```
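The checks described above can be approximated in a few lines of base R. The sketch below only illustrates the rules and is not the implementation used by `check_synop()`. The helper name `sketch_check()` is hypothetical, and the sketch assumes that '/' may stand in for a digit inside a group, as in `5////`.

```r
# Hypothetical helper (not part of synopR): a rough approximation of the
# rules that check_synop() is described as enforcing.
sketch_check <- function(x) {
  x <- trimws(x)
  has_start <- startsWith(x, "AAXX")
  has_end <- endsWith(x, "=")

  # Drop the leading "AAXX" and the trailing "=" before looking at the groups
  body <- trimws(sub("=$", "", sub("^AAXX", "", x)))
  groups <- strsplit(body, "\\s+")

  # Assumption: a group is 5 characters drawn from 0-9 and '/',
  # or one of the tokens '333', '555', 'NIL'
  groups_ok <- vapply(groups, function(g) {
    length(g) > 0 && all(grepl("^([0-9/]{5}|333|555|NIL)$", g))
  }, logical(1))

  has_start & has_end & groups_ok
}

msgs <- c(
  "AAXX 04003 87736 32965 00000 10204 20106 39982 40074 5//// 333 10266 20158 =",
  "AAXX 03183 87736 32965 12708 10254 20052 30005 40098 5//// 80005 333 56000 8127 =",
  "AAXX 03183 87736 32965 12708 10254 20052 30005 40098 5//// 80005 333 56000 81270 ="
)

# The second message fails because of the 4-character group '8127'
result <- sketch_check(msgs)
print(result)
```

Unlike `check_synop()`, this sketch returns only a logical vector and does not describe which rule was broken.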

All the columns associated with information not present in the SYNOP messages are removed by default. If for some reason you don't want that, set `remove_empty_cols = FALSE`.

The optional `wmo_identifier` argument filters the input automatically when the data contain messages from different stations. If you are working with thousands of SYNOP strings from multiple stations, this built-in filtering saves a separate subsetting step.


```{r}
library(synopR)
# Messages from 87736 and 87016
mixed_synop <- c("AAXX 01183 87736 12465 20000 10326 20215 39974 40064 5//// 60001 82100 333 56600 82818=",
                 "AAXX 04033 87016 41460 83208 10200 20194 39712 40114 50003 70292 888// 333 56699 82810 88615="
                 )

colorado_data <- show_synop_data(mixed_synop, wmo_identifier = '87736', remove_empty_cols = TRUE)
knitr::kable(t(colorado_data))
```
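Before choosing a value for `wmo_identifier`, it can help to see which stations a mixed vector contains. The following sketch uses base R only and assumes that, in an "AAXX" message, the station identifier is the third whitespace-separated token, as in the examples above. `station_of()` is a hypothetical helper, not a synopR function.

```r
# Hypothetical helper (not part of synopR): extract the third token of each
# message, assumed here to be the WMO station identifier.
station_of <- function(x) {
  # Remove the trailing "=" so it cannot stick to the last token
  body <- trimws(sub("=\\s*$", "", trimws(x)))
  tokens <- strsplit(body, "\\s+")
  vapply(tokens, function(g) {
    if (length(g) >= 3) g[3] else NA_character_
  }, character(1))
}

mixed_synop <- c(
  "AAXX 01183 87736 12465 20000 10326 20215 39974 40064 5//// 60001 82100 333 56600 82818=",
  "AAXX 04033 87016 41460 83208 10200 20194 39712 40114 50003 70292 888// 333 56699 82810 88615="
)

stations <- station_of(mixed_synop)

# Number of messages per station
table(stations)
```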


## Workflow with Ogimet

[Ogimet](https://www.ogimet.com) is a well-known and reputable source of SYNOP messages. `download_from_ogimet()` downloads SYNOP messages from this website. You need the WMO identifier of the station of interest, and the requested period can't be longer than 370 days. Be aware that each downloaded message carries a prefix added by Ogimet, with the WMO identifier and the date. This is not an issue, because `parse_ogimet()` is designed to separate this prefix from the raw SYNOP message for processing (`show_synop_data()` makes use of the prefix information to add the columns 'Year' and 'Month').

```{r}
library(synopR)

# Suppose we have downloaded this data with:
# download_from_ogimet("87736","2026-02-01","2026-02-01")
data_input <- data.frame(synops = c("87736,2026,02,01,03,00,AAXX 01034 87736 NIL=",
                                    "87736,2026,02,01,06,00,AAXX 01064 87736 NIL=",
                                    "87736,2026,02,01,09,00,AAXX 01094 87736 NIL=",
                                    "87736,2026,02,01,12,00,AAXX 01123 87736 12965 31808 10240 20210 39992 40082 5//// 60104 82075 333 10282 20216 56055 82360=",
                                    "87736,2026,02,01,15,00,AAXX 01154 87736 NIL=",
                                    "87736,2026,02,01,18,00,AAXX 01183 87736 12465 20000 10326 20215 39974 40064 5//// 60001 82100 333 56600 82818=",
                                    "87736,2026,02,01,21,00,AAXX 01214 87736 NIL="))

# Pass the column of strings: `parse_ogimet(data_input)` (the whole data frame) is incorrect
data_from_ogimet <- parse_ogimet(data_input$synops)

# The 'Year' and 'Month' columns are included!
# 'NIL' messages are ignored
data_from_ogimet |> show_synop_data() |> t() |> knitr::kable()
```

The Ogimet example does not use `check_synop()`. Note, however, that a data frame with multiple columns, in which the SYNOP column is not explicitly specified, is accepted **if and only if that data frame is the direct output of** `parse_ogimet()`.
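To make the Ogimet prefix concrete, the sketch below splits downloaded lines into their prefix fields and the raw SYNOP message using base R only. It assumes the layout seen in the example: six comma-separated fields (station, year, month, day, hour, minute) followed by the message. The helper `split_ogimet_line()` and its column names are hypothetical and do not describe what `parse_ogimet()` returns.

```r
# Hypothetical helper (not part of synopR): split the assumed Ogimet prefix
# "station,year,month,day,hour,minute," from the raw SYNOP message.
split_ogimet_line <- function(x) {
  pattern <- "^([0-9]{5}),([0-9]{4}),([0-9]{2}),([0-9]{2}),([0-9]{2}),([0-9]{2}),(.*)$"
  m <- regmatches(x, regexec(pattern, x))

  # A line that does not follow the assumed layout yields an empty match
  if (!all(lengths(m) == 8)) {
    stop("Some lines do not follow the assumed prefix layout")
  }

  # Column 1 is the full match; columns 2 to 8 are the captured fields
  parts <- do.call(rbind, m)
  data.frame(
    station = parts[, 2],
    year = as.integer(parts[, 3]),
    month = as.integer(parts[, 4]),
    day = as.integer(parts[, 5]),
    hour = as.integer(parts[, 6]),
    minute = as.integer(parts[, 7]),
    synop = parts[, 8],
    stringsAsFactors = FALSE
  )
}

ogimet_lines <- c(
  "87736,2026,02,01,15,00,AAXX 01154 87736 NIL=",
  "87736,2026,02,01,18,00,AAXX 01183 87736 12465 20000 10326 20215 39974 40064 5//// 60001 82100 333 56600 82818="
)

prefix_split <- split_ogimet_line(ogimet_lines)
print(prefix_split[, c("station", "year", "month", "day", "hour")])
```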

All these steps (download, parse, check and decode) are combined in a single function, `direct_download_from_ogimet()`, which returns the decoded result directly.
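Since a single download request cannot cover more than 370 days, a longer period has to be requested in pieces. One possible way to build the pieces in base R is sketched below. `split_period()` is a hypothetical helper, and whether the 370-day limit counts both end dates is an assumption, so lower `max_days` if in doubt.

```r
# Hypothetical helper (not part of synopR): cut a date range into
# consecutive windows of at most `max_days` calendar days each.
split_period <- function(start, end, max_days = 370) {
  start <- as.Date(start)
  end <- as.Date(end)
  stopifnot(start <= end, max_days >= 1)

  # Window starts are spaced `max_days` days apart
  starts <- seq(start, end, by = max_days)
  # Each window ends the day before the next one starts, capped at `end`
  ends <- pmin(starts + (max_days - 1), end)

  data.frame(start = starts, end = ends)
}

windows <- split_period("2024-01-01", "2026-01-01")
print(windows)
```

Each row can then be passed to `download_from_ogimet()` in the same form as the call shown earlier, for example `download_from_ogimet("87736", format(windows$start[1]), format(windows$end[1]))`.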

## Limitations

A complete and detailed table with the meaning and details of all the columns returned by `show_synop_data()` is available in the vignette "Extracted data reference".

Code tables are available in the vignette "Code Tables" for direct official conversions.

### General limitations

* Sections 222 and 444 are not supported. Their presence will lead to a wrong result or cause the function to crash.
* Section 555 (reserved for national distribution) is silently ignored, as its content varies by country.
* Groups 9 from sections 1 and 3 are not supported.
* Group 29UUU (very rare) will lead to a wrong dew point result.
* Group 54 from section 3 (temperature change) is ignored.
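
Because sections 222 and 444 are not supported, it may be worth flagging messages that seem to contain them before decoding. The sketch below is a base-R heuristic, not a synopR feature. `flag_unsupported()` is a hypothetical helper. It assumes that section 2 opens with a five-character group starting with `222`, that section 4 opens with the bare group `444`, and that the first five tokens (`AAXX`, the date/time group, the station identifier and the two leading groups of section 1) are never section markers. Treat a flag as a prompt for manual review rather than as proof.

```r
# Hypothetical helper (not part of synopR): flag messages that appear to
# contain a section 222 or section 444 marker.
flag_unsupported <- function(x) {
  body <- trimws(sub("=\\s*$", "", trimws(x)))
  tokens <- strsplit(body, "\\s+")
  vapply(tokens, function(g) {
    # Skip the first five tokens, which can legitimately start with "222"
    rest <- g[-seq_len(min(5, length(g)))]
    any(grepl("^(222[0-9/]{2}|444)$", rest))
  }, logical(1))
}

candidates <- c(
  # Real message from the examples above: no section 222 or 444
  "AAXX 01183 87736 12465 20000 10326 20215 39974 40064 5//// 60001 82100 333 56600 82818=",
  # Made-up messages, for illustration only
  "AAXX 01183 87736 12465 22215 10326 20215 22200 00150=",
  "AAXX 01183 87736 12465 20000 10326 333 56600 444 12345="
)

flagged <- flag_unsupported(candidates)
print(flagged)
```

Messages that are flagged can be set aside with `candidates[!flagged]` before calling `show_synop_data()`.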