Building Git Packages for R#
R packages can come in multiple formats:
- Source: A collection of directories and files containing source code.
- Bundle: A specially created tar file containing bundled source code. The result of R CMD build.
- Binary: A binary file specific to an operating system and architecture, containing compiled source code. Not an executable. The result of R CMD INSTALL.
In some configurations, RStudio Package Manager will need to run R in order to transform a package from one state to another. RStudio Package Manager uses the RStudio Job Launcher to perform this task.
Job Launcher#
The RStudio Job Launcher is a service responsible for running jobs in support of
RStudio Package Manager. The Job Launcher is automatically installed when
RStudio Package Manager is installed. Settings for the Job Launcher can be
configured directly in the RStudio Package Manager configuration file,
/etc/rstudio-pm/rstudio-pm.gcfg. See the configuration section in the appendix.
RStudio Package Manager can only use its own instance of the Job Launcher and runs all tasks locally. Other RStudio products may have separate Job Launcher instances managed independently from RStudio Package Manager. The RStudio Package Manager's Job Launcher cannot be shared by other products, nor can RStudio Package Manager use a Job Launcher instance set up for another product.
Server log messages related to this component can be shown by enabling the launcher
region. More information about activating log regions is in the configuration appendix.
R Configuration#
You can optionally specify the installation of R in the RStudio Package Manager configuration file:
; /etc/rstudio-pm/rstudio-pm.gcfg
[Server]
RVersion = /opt/R/4.0.5
Replace /opt/R/4.0.5 with the path to your R installation. Use the path to
the R installation directory, not the path to the binary (do not include
/bin/R). While multiple versions of R may be installed on the server, only
one version of R may be specified for use by RStudio Package Manager.
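The distinction between the installation directory and the binary can be sketched in a few lines of Python. This is only an illustration: the helper name r_home_for_config is hypothetical and is not part of RStudio Package Manager. It shows how a path that ends in /bin/R maps to the directory expected by RVersion.

```python
from pathlib import PurePosixPath


def r_home_for_config(path: str) -> str:
    """Return the R installation directory suitable for the RVersion setting.

    Hypothetical helper for illustration only: if the given path points at
    the R binary (.../bin/R), strip the trailing bin/R components.
    """
    p = PurePosixPath(path)
    if p.name == "R" and p.parent.name == "bin":
        p = p.parent.parent
    return str(p)


# A path to the binary is reduced to the installation directory.
print(r_home_for_config("/opt/R/4.0.5/bin/R"))
# A path that is already the installation directory is returned unchanged.
print(r_home_for_config("/opt/R/4.0.5"))
```

Both calls print /opt/R/4.0.5, which is the form the configuration file expects.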
Note
R must be configured in order for the Job Launcher to work correctly. For more information, refer to the Job Launcher section.
If the RVersion field is included, it must be valid and must appear only once
in the configuration file. After restarting the RSPM process, check the server
log for messages relevant to the R configuration.
R Installation#
RStudio recommends that in most cases you install R from pre-compiled binaries.
To install from pre-compiled binaries, follow the instructions at Install R.
Alternatively, you can install R from source by following the instructions at Install R from source.
Git Builders#
RStudio Package Manager defines a git-builder as an entity that watches a
remote Git endpoint (e.g., git@github.com:user/example.git) for changes and
builds R package bundles.
An administrator follows these steps:
- Create a git source.
- Create a git-builder for the source, specifying whether to watch for commits to a Git branch or tags in a Git repository. The endpoint can be HTTP or SSH (see below). See the create git-builder command for full details, e.g., how to track a specific branch.
- Based on the selection specified with the create git-builder command, RStudio Package Manager clones the Git endpoint and runs an R job to transform the Git clone into a package bundle. The package bundle is made available to any repositories subscribing to the source.
- RStudio Package Manager polls the Git endpoint to watch for either new commits or new tags (based on the selection specified with the create git-builder command). If an update is available, RStudio Package Manager automatically pulls the new changes and launches an R job. The R job creates a package bundle from the updated Git clone and updates the package available in the git source. Previous versions are archived.
- Users install the package from the repository via install.packages, NOT devtools.
See the Getting Started section for a specific example.
Server log messages related to this component can be shown by enabling the git
region.
More information about activating log regions is in the configuration appendix.
Access restricted Git endpoints using SSH keys#
If Git builders require authentication, RStudio Package Manager can use SSH keys to authenticate against the endpoint.
Begin by creating an SSH key and granting the SSH key access to the Git
endpoint. The specific steps will depend on your Git provider. Once you have the
path to the SSH key, use the import command to securely name and store the SSH
key for later use by RStudio Package Manager. If desired, you can now remove the
SSH key file. Multiple keys can be imported.
To use the newly imported SSH key with a new Git builder, specify the key
name with the --ssh-key flag in the create git-builder command.
SSH Key Security#
RStudio Package Manager encrypts and stores imported SSH keys in the metadata
database. Any person (by default, members of the rstudio-pm Unix group) with
access to the admin CLI can:
- Associate an imported key with a Git builder using the create git-builder command
- List the names of available SSH keys using the list ssh-keys command
Users cannot access the contents of the key, nor is the key available for arbitrary actions. We recommend granting SSH keys imported to RStudio Package Manager limited read-only access to only the endpoints you wish to expose as R packages.
Imported keys are encrypted at rest. During Git operations that require SSH, the keys are added to an ssh-agent and are therefore never written to the filesystem or to STDIN.
Although RStudio Package Manager allows the use of SSH keys with no passphrase, it is still recommended to use a strong SSH key with a passphrase.
SSH keys may be rotated by creating a new SSH key, importing it, and editing an existing git-builder to use it:
Terminal
rspm import --name=[key name] --path=[/path/to/key]
rspm edit git-builder --name=[git-builder name] --source=[git source] --new-ssh-key=[key name]
Commits vs Tags#
A package based on a Git endpoint can be configured to watch one of two types of changes: "commits" or "tags". In short, "commits" watches for changes to a specified Git branch, whereas "tags" watches for new tags in the whole Git repository. In more detail:
- Commits - RStudio Package Manager will update the package any time new commits are discovered in a branch. In this mode, RStudio Package Manager automatically modifies the package's version, assigning a unique version number to each build. The version number is created based on the commit timestamp and is designed to avoid conflicts with the version scheme used by the package author. For example, if the DESCRIPTION file for a package indicates a version of 1.1-3, the automatic version number would be: 1.1-3.0.0.0.1537204599. If the author updates the package with a new commit, but keeps the version in the DESCRIPTION file the same, the new automatic version number would reflect the new commit timestamp, e.g. 1.1-3.0.0.0.1537218677. This process ensures that users of the package always get the correct behavior from install.packages, with newer commits being associated with a semantically higher version number.
  The above version behavior for "commits" triggers may be overridden by using the Git.ForceDescriptionVersion configuration option. This will force all packages built by commits in a branch to use the exact version in the DESCRIPTION file.
- Tags - RStudio Package Manager will update the package any time a new Git tag is discovered. In this mode, RStudio Package Manager retains the version specified in the package's DESCRIPTION file. This mode is designed to work when a Git tag is used to indicate a package release. Note: The name of the tag must match the version in the DESCRIPTION file. For example, if your package's DESCRIPTION file has Version: 5.4.2, your tag must be either 5.4.2 or v5.4.2. If two tags reference the same version, preference is given to the newer tag. If a newer tag references an older version than a prior tag, the new tag is built as an archived package. If a tag is removed from a Git endpoint, any packages already built for that tag remain.
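The automatic version numbers used in commit mode can be illustrated with a short Python sketch. The format used here (the DESCRIPTION version followed by .0.0.0. and the commit timestamp) is inferred from the example above and is an assumption, not a documented specification. The function names are hypothetical, and version_key only approximates how R orders package versions, by comparing the integer components separated by "." or "-".

```python
import re


def commit_build_version(description_version: str, commit_timestamp: int) -> str:
    # Assumed format, inferred from the documented example:
    # <DESCRIPTION version>.0.0.0.<commit timestamp>
    return f"{description_version}.0.0.0.{commit_timestamp}"


def version_key(version: str) -> tuple:
    # Approximation of R's ordering: split on "." or "-" and compare
    # the resulting integer components from left to right.
    return tuple(int(part) for part in re.split(r"[.-]", version))


older = commit_build_version("1.1-3", 1537204599)
newer = commit_build_version("1.1-3", 1537218677)
print(older)
print(newer)

# Each automatic version sorts after the author's 1.1-3, a later commit sorts
# after an earlier one, and both sort before the author's next release 1.1-4.
print(version_key("1.1-3") < version_key(older) < version_key(newer) < version_key("1.1-4"))
```

Under these assumptions the final line prints True, which is why a later commit is treated as an upgrade while the author's own version scheme is left undisturbed.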
Commit mode is recommended for bleeding edge repositories, whereas tag mode is suitable for exposing stable releases of packages.
A git source can support different packages with different modes. However, a given package can only have one mode in a source. If you would like to surface the same package in both commit and tag mode, you must create two git sources.
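The tag naming rule for tag mode (the tag must equal the DESCRIPTION version, optionally prefixed with v) can be written as a small check. This is a minimal Python sketch with a hypothetical function name; it only mirrors the rule as stated above and is not taken from the product's implementation.

```python
def tag_matches_version(tag: str, description_version: str) -> bool:
    # Per the rule above, a tag is accepted when it is exactly the version
    # from the DESCRIPTION file, or that version prefixed with "v".
    return tag in (description_version, "v" + description_version)


print(tag_matches_version("5.4.2", "5.4.2"))
print(tag_matches_version("v5.4.2", "5.4.2"))
print(tag_matches_version("release-5.4.2", "5.4.2"))
```

The first two calls print True and the third prints False. A check like this can be useful in a release script to catch a mismatched tag before it is pushed.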
Git directories#
By default, packages will be built from the root directory of the Git repository. If the R package
exists in a different location, it can be specified using the --sub-dir flag
when adding a git package.
Managing Packages from Git#
The git-builder (described above) watches the Git endpoint for changes, automatically
handling package updates and archives. There might be cases where you wish to remove
packages, or to stop package building altogether.
Packages can be removed at any time using the remove command.
To stop automatic package building, but keep the existing packages, use the
delete git-builder command. To resume package building, simply create a new
git-builder with the same metadata.
Terminal
# To remove previously-built packages from git:
rspm remove --source=[name of source] --name=[name of package and scope]
# To stop automatic package building, but keep the packages:
rspm delete git-builder --name=[name of package] --source=[name of source]
To view information about the current Git endpoints that are being tracked, use:
Terminal
rspm list git-builders
Editing git-builders#
Git builders have a few fields which may be edited: SSH keys, URLs, subdirectories, and branches (for "commits" triggers). A git-builder cannot be changed from an SSH URL to an HTTP URL or vice versa.
Terminal
rspm edit git-builder --name=[git-builder name] --source=[git source name] --new-url=[HTTP/SSH URL]
rspm edit git-builder --name=[git-builder name] --source=[git source name] --new-ssh-key=[SSH key name]
rspm edit git-builder --name=[git-builder name] --source=[git source name] --new-branch=[branch name]
rspm edit git-builder --name=[git-builder name] --source=[git source name] --new-sub-dir=[sub directory path]
rspm edit git-builder --name=[git-builder name] --source=[git source name] --remove-sub-dir
Combining packages from Git(Hub) with other package sources#
Local packages cannot be added manually to a git source, but a repository can surface packages from a git source alongside local packages and CRAN packages by subscribing to multiple sources. Take care when managing a repository's subscriptions, as order is important; see the Multiple Sources section.
Polling Frequency#
You can control how frequently RStudio Package Manager checks for updates using
the Git.PollInterval configuration field. If
multiple commits occur between checks, RStudio Package Manager will create a
single version representing all of the changes. If multiple tags are created or
removed between checks, RStudio Package Manager will build each tag
individually, automatically archiving tags representing older versions of the
package.
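The difference between how commits and tags are handled between polls can be sketched as follows. This is a simplified Python illustration with hypothetical function names and data shapes (commits as a list of SHAs ordered oldest to newest, tags as lists of names). It is an assumption about the observable behavior described above, not the product's actual polling code.

```python
from typing import List, Optional


def commits_to_build(last_built_sha: Optional[str], branch_commits: List[str]) -> List[str]:
    # branch_commits is ordered oldest to newest. Several new commits between
    # polls result in a single build of the newest commit.
    if not branch_commits or branch_commits[-1] == last_built_sha:
        return []
    return [branch_commits[-1]]


def tags_to_build(known_tags: List[str], remote_tags: List[str]) -> List[str]:
    # Every tag that has not been seen before is built individually.
    return [tag for tag in remote_tags if tag not in known_tags]


# Three commits arrived since "a1" was built: only the newest, "d4", is built.
print(commits_to_build("a1", ["a1", "b2", "c3", "d4"]))
# Two new tags arrived since the last poll: both are built.
print(tags_to_build(["v1.0.0"], ["v1.0.0", "v1.0.1", "v1.1.0"]))
```

The first call prints ['d4'] and the second prints ['v1.0.1', 'v1.1.0'].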
Repository Versioning is identical in all source types, including git sources.
Tracking Changes and Errors#
If a repository subscribes to a git source, you can view the git source's history in the Activity Log. The Activity Log will identify each change to a package including the new version, and a message will indicate the associated Git tag or commit as appropriate. If an error is encountered attempting to clone, poll, or bundle a package, the Activity Log will record the attempt and include a message with the CLI command to be run to view a full error log.
You can also use the following RSPM CLI commands to quickly check your active Git builders and view the logs:
Terminal
$ rspm list git-builders
<< Git Builders:
<< - [git package name]
<< Source: [source name]
<< URL: [source url]
<< Trigger: [git package trigger]
<< Key: none
Terminal
$ rspm list git-builds --source=[source name] --name=[git package name]
<< Git Builds:
<< - [git package name]
<< Transaction ID: [transaction ID]
<< SHA: [SHA]
<< Tag: [tag]
<< Status: [job status]
<< Time: [time of run]
<< Only showing latest build, for more builds use the --count and --page flags
<< For more information run: rspm logs --transaction-id=[transaction ID]
Terminal
$ rspm logs --transaction-id=[transaction ID]
<< ...
<< [git package run logs]
<< ...
RStudio Package Manager automatically tries to build updates from a Git source three times. If the build fails more than three times, the update causing the failure is ignored. New updates are still discovered and built.
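The retry behavior can be pictured with a minimal Python sketch. It assumes the limit is three attempts in total for a given update, and the names MAX_ATTEMPTS and build_with_retries are hypothetical. The real scheduling and bookkeeping inside RStudio Package Manager are not described in this guide.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # Assumption based on the text: three build attempts per update.


def build_with_retries(build: Callable[[], bool], max_attempts: int = MAX_ATTEMPTS) -> str:
    # Try the build up to max_attempts times and stop at the first success.
    for _ in range(max_attempts):
        if build():
            return "built"
    # After the final failure this update is skipped; later updates are
    # still discovered and built independently.
    return "ignored"


# Example: a build that fails twice and then succeeds on the third attempt.
results = iter([False, False, True])
print(build_with_retries(lambda: next(results)))

# Example: a build that always fails is ignored after three attempts.
print(build_with_retries(lambda: False))
```

The first example prints built and the second prints ignored.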
To retry a failed update, or to force a Git builder to rebuild the latest package
version, use the rerun command:
Terminal
rspm rerun git-builder \
--name=[package name] \
--source=[source name] \
--tag=[tag to rebuild, only required if the build trigger is tags]
To aid in debugging, it can help to view output from the git commands
that are run as well as output from the SSH connection when applicable.
To enable debugging, refer to the Debug.Log configuration property in
the configuration appendix.
To enable the debug log temporarily without restarting the server, use the
rspm config command:
Terminal
rspm config debug logger activate git
Process Management#
See the process management section for information on how RStudio Package Manager securely runs R processes when building R packages for git sources.