Reading the ‘BOE’

class: center, middle, inverse, title-slide

.title[
# Reading the ‘BOE’
]
.author[
### Lluís Revilla Sancho <a href="https://twitter.com/Lluis_Revilla">@Lluis_Revilla</a>
]
.date[
### 2022-05-13
]

---

# BOE

Retrieve data from the official Spanish Gazette:

```r
library("BOE")
sumario_hoy <- retrieve_sumario(as.Date("2022/05/06")) # Or retrieve_sumario("BOE-S-2022-1")
colnames(sumario_hoy)
##  [1] "date"            "sumario_nbo"     "sumario_code"    "section"        
##  [5] "section_number"  "departament"     "departament_etq" "epigraph"       
##  [9] "text"            "publication"     "pages"
```
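
Since the summary is a regular data frame, it can be filtered with base R. A minimal sketch, using a toy stand-in whose column names come from the output above and whose values are invented for illustration:

```r
# Toy stand-in for a daily summary: only the column names are taken from
# the real output, every value here is made up.
sumario <- data.frame(
  departament = c("MINISTERIO DE HACIENDA", "UNIVERSIDADES", "UNIVERSIDADES"),
  publication = c("BOE-A-0000-0001", "BOE-A-0000-0002", "BOE-B-0000-0003"),
  stringsAsFactors = FALSE
)
# Keep only the identifiers published by one department
from_universities <- sumario$publication[sumario$departament == "UNIVERSIDADES"]
from_universities
## [1] "BOE-A-0000-0002" "BOE-B-0000-0003"
```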


???

Daily summaries can be retrieved by date or by CVE.
From the summary it is then easier to extract information about a specific publication.

---

## Examples

.pull-left[

It also works for individual documents, which makes it possible to search their text:

```r
(CVE <- sumario_hoy$publication[1])
## [1] "BOE-A-2022-7418"
cat(colnames(retrieve_document(CVE)))
## identificador titulo diario diario_numero seccion subseccion departamento rango numero_oficial fecha_disposicion fecha_publicacion fecha_vigencia fecha_derogacion letra_imagen pagina_inicial pagina_final suplemento_letra_imagen suplemento_pagina_inicial suplemento_pagina_final estatus_legislativo origen_legislativo estado_consolidacion judicialmente_anulada vigencia_agotada estatus_derogacion url_epub url_pdf url_pdf_catalan url_pdf_euskera url_pdf_gallego url_pdf_valenciano url_eli departamento_codigo fecha_actualizacion analysis text text_xml
cat(colnames(retrieve_document("BORME-S-2022-1")))
## date sumario_nbo sumario_code section section_number emisor emisor_etq text publication
```

A lot of data is available to users, which makes it possible to analyze:

- Date of approval, date of publication
- Department
- Type of publication
- Full text
- Legal status
- ...
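
As an illustration of the kind of analysis these fields allow, here is a sketch that computes the delay between approval and publication. The dates are invented, and it assumes that `fecha_disposicion` holds the approval date and that both columns have already been converted to `Date`:

```r
# Invented dates: the column names match those returned by
# retrieve_document(), but the Date class used here is an assumption.
doc <- data.frame(
  fecha_disposicion = as.Date("2022-05-03"),
  fecha_publicacion = as.Date("2022-05-06")
)
# Days elapsed between approval and publication
delay <- as.numeric(difftime(doc$fecha_publicacion, doc$fecha_disposicion,
                             units = "days"))
delay
## [1] 3
```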

]

.pull-right[

For example, looking at publications from the universities:

Almost half of those publications are due to people missing their degree certificates.
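
A share like this one can be computed in a single line once each publication has a type. The sketch below uses invented rows and a hypothetical `type` column, so its result does not reproduce the real figure:

```r
# Invented rows and a hypothetical `type` column, for illustration only.
publications <- data.frame(
  type = c("lost degree certificate", "staff",
           "lost degree certificate", "study plan"),
  stringsAsFactors = FALSE
)
# Proportion of publications of a given type
share <- mean(publications$type == "lost degree certificate")
share
## [1] 0.5
```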

]

.bottom[.center[More examples at: https://llrs.github.io/BOE_historico/]]

???

For each document, all the fields reported in the XML file can be retrieved in a tidy format, which allows for nice analyses, graphs and statistics.