In this vignette, we discuss how to write file specifications and use them to create a file collection. Together, they define which files in the source R package are in scope for packing.

File specification

In pkglite, a file specification defines the parameters to locate the files matching specific criteria under an R package. One can use file_spec() to create a file specification.

For example, to match the .R files under the R/ folder, use

library("pkglite")

# Match files ending in .R (case-insensitive) directly under R/
fs <- file_spec(
  "R/",
  pattern = "\\.R$", format = "text",
  recursive = FALSE, ignore_case = TRUE, all_files = FALSE
)

fs
── File specification ──────────────────────────────────────────────────────────
● Relative path: R/
● Pattern: '\\.R$'
● Format: 'text'
● Recursive: FALSE
● Ignore case: TRUE
● All files: FALSE
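
To build intuition for what these parameters select, the sketch below reproduces a similar match with base R's list.files() in a throwaway directory. It assumes, for illustration only, that recursive, ignore_case, and all_files behave like the list.files() arguments recursive, ignore.case, and all.files. It is not pkglite's implementation, and the directory and file names are made up.

```r
# Build a throwaway directory that mimics a tiny package layout.
# All names below are hypothetical and exist only for this sketch.
root <- file.path(tempdir(), "pkgdemo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "R", "sub"), recursive = TRUE)
invisible(file.create(
  file.path(root, "R", c("hello.R", "data.r", "notes.txt", ".hidden.R"))
))
invisible(file.create(file.path(root, "R", "sub", "nested.R")))

# Assumed analogue of the specification above: a case-insensitive match
# on a trailing ".R", without recursing and without hidden files.
matched <- list.files(
  file.path(root, "R"),
  pattern = "\\.R$",
  recursive = FALSE,
  ignore.case = TRUE,
  all.files = FALSE
)
matched
```

Here notes.txt fails the pattern, .hidden.R is skipped because hidden files are excluded, and sub/nested.R is skipped because the listing does not recurse.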

File collection

A file collection is generated by evaluating file specification(s) under a package directory. It contains the metadata of the list of matching files. Use collate() to create a file collection:

pkg <- system.file("examples/pkg1/", package = "pkglite")
pkg %>% collate(fs)
── File collection ─────────────────────────────────────────────────────────────
── Package: pkg1 ───────────────────────────────────────────────────────────────
          path_rel format
1         R/data.R   text
2        R/hello.R   text
3 R/pkg1-package.R   text

File specification templates

We have included a few file specifications to cover the common file structures in an R package. See ?file_spec_templates for details. We will use some of them to demonstrate how to combine them to cover an entire package.

File specification usage patterns

To generate a file collection that includes a core set of files under the package root, use

pkg %>% collate(file_root_core())

── File collection ─────────────────────────────────────────────────────────────
── Package: pkg1 ───────────────────────────────────────────────────────────────
     path_rel format
1 DESCRIPTION   text
2   NAMESPACE   text
3     NEWS.md   text
4   README.md   text

To include all files under the package root, use

pkg %>% collate(file_root_all())

── File collection ─────────────────────────────────────────────────────────────
── Package: pkg1 ───────────────────────────────────────────────────────────────
     path_rel format
1 DESCRIPTION   text
2   NAMESPACE   text
3     NEWS.md   text
4   README.md   text

Sometimes, a file collection might contain files or directories that should always be excluded, such as the files defined in pattern_file_sanitize(). One could use sanitize_file_collection() to remove such items from the file collection:

pkg %>%
  collate(file_root_all()) %>%
  sanitize_file_collection()

── File collection ─────────────────────────────────────────────────────────────
── Package: pkg1 ───────────────────────────────────────────────────────────────
     path_rel format
1 DESCRIPTION   text
2   NAMESPACE   text
3     NEWS.md   text
4   README.md   text
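
The idea behind sanitizing can be sketched in base R as filtering a vector of relative paths against a set of exclusion patterns. The paths and patterns below are hypothetical and chosen only for illustration; pattern_file_sanitize() holds the patterns pkglite actually uses, and this is not how sanitize_file_collection() is implemented.

```r
# Hypothetical collection: relative paths as they might appear after collate().
paths <- c("DESCRIPTION", "NAMESPACE", "R/hello.R", "R/.DS_Store", ".Rhistory")

# Hypothetical exclusion patterns, made up for this sketch.
exclude <- c("(^|/)\\.DS_Store$", "(^|/)\\.Rhistory$")

# Drop every path that matches at least one exclusion pattern.
drop <- grepl(paste(exclude, collapse = "|"), paths)
kept <- paths[!drop]
kept
```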

We can feed one or more file specifications to collate(). The union of the matched files will be returned:

pkg %>% collate(file_r(), file_man())
── File collection ─────────────────────────────────────────────────────────────
── Package: pkg1 ───────────────────────────────────────────────────────────────
              path_rel format
1             R/data.R   text
2            R/hello.R   text
3     R/pkg1-package.R   text
4        R/sysdata.rda binary
5       man/dataset.Rd   text
6   man/hello_world.Rd   text
7  man/pkg1-package.Rd   text
8 man/figures/logo.png binary

If a file specification does not match any files, an empty file collection is returned:

pkg %>% collate(file_src())
── File collection ─────────────────────────────────────────────────────────────
── Package: pkg1 ───────────────────────────────────────────────────────────────
[1] path_rel format  
<0 rows> (or 0-length row.names)

Naturally, passing such a specification to collate() alongside others does not add any files to the collection:

pkg %>% collate(file_r(), file_man(), file_src())
── File collection ─────────────────────────────────────────────────────────────
── Package: pkg1 ───────────────────────────────────────────────────────────────
              path_rel format
1             R/data.R   text
2            R/hello.R   text
3     R/pkg1-package.R   text
4        R/sysdata.rda binary
5       man/dataset.Rd   text
6   man/hello_world.Rd   text
7  man/pkg1-package.Rd   text
8 man/figures/logo.png binary
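
The union behavior, including the empty case, can be mimicked with plain data frames. This is a minimal sketch that assumes each specification yields a table of path_rel and format columns, as in the printed output; the data frames below are hand-written stand-ins rather than objects produced by pkglite.

```r
# Hypothetical results of three specifications. The third matched nothing.
r_files   <- data.frame(path_rel = c("R/data.R", "R/hello.R"), format = "text")
man_files <- data.frame(path_rel = "man/hello_world.Rd", format = "text")
src_files <- data.frame(path_rel = character(0), format = character(0))

# Stack the results and keep each row once; the empty table adds no rows.
combined <- unique(rbind(r_files, man_files, src_files))
nrow(combined)
```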

Default file specification

file_default() offers a default combination of the file specification templates.

pkg %>% collate(file_default())

── File collection ─────────────────────────────────────────────────────────────
── Package: pkg1 ───────────────────────────────────────────────────────────────
                path_rel format
1            DESCRIPTION   text
2              NAMESPACE   text
3                NEWS.md   text
4              README.md   text
5               R/data.R   text
6              R/hello.R   text
7       R/pkg1-package.R   text
8          R/sysdata.rda binary
9         man/dataset.Rd   text
10    man/hello_world.Rd   text
11   man/pkg1-package.Rd   text
12  man/figures/logo.png binary
13 vignettes/example.bib   text
14    vignettes/pkg1.Rmd   text
15      data/dataset.rda binary

Automatic file specification

file_auto() provides a specification that lists all files (with an extension) under a folder recursively. It also guesses the file format type based on the file extension. This is useful for directories like inst/ that do not share a standard structure or filename pattern across packages.

pkg %>% collate(file_auto("inst/"))
── File collection ─────────────────────────────────────────────────────────────
── Package: pkg1 ───────────────────────────────────────────────────────────────
                  path_rel format
1 inst/extdata/dataset.tsv   text
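
Guessing a format from the file extension can be sketched as a simple lookup. The helper guess_format() and the text_ext list below are hypothetical and exist only for this illustration; the extensions pkglite actually treats as text or binary may differ.

```r
# Hypothetical list of extensions treated as text; anything else is "binary".
text_ext <- c("r", "rd", "rmd", "md", "tsv", "csv", "txt", "bib")

# Hypothetical helper: classify paths by their lower-cased extension.
guess_format <- function(path) {
  ext <- tolower(tools::file_ext(path))
  ifelse(ext %in% text_ext, "text", "binary")
}

guess_format(c("inst/extdata/dataset.tsv", "man/figures/logo.png", "R/sysdata.rda"))
```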