☰ 🎬 Datalayer GitHub Actions

This repository contains reusable GitHub Actions for Datalayer workflows.

datalayer-evals

The datalayer-evals action runs Datalayer eval reports in CI and produces report artifacts.

It uses the datalayer-core Python API directly (the DatalayerClient and the core eval-report helpers) rather than shelling out to the CLI, so the generated reports contain the full structured failure diagnostics (per-run failure causes, stages, types and detail excerpts). Failures are also aggregated into the GitHub step summary and exposed as action outputs.

It supports two execution modes:

Primary report mode (single evalset).
Comparison mode (primary + secondary evalsets) with a generated summary markdown.

It also supports optional multi-agentspec runtime bootstrap before reporting via:

agentspec-id (single)
agentspec-ids (comma-separated list)

Evalsets can be provided as IDs, or created on the fly from spec files.

Primary report mode produces, for each report:

<output-markdown> and a matching .csv (when export-csv is true)
a <output-markdown>.log artifact containing the full structured report JSON (including every run-level failure cause)
timestamped files report-<timestamp>.md and report-<timestamp>.csv

The action is implemented in Python and can be consumed from other repositories.

Secrets

Authentication and billing context are supplied through repository secrets so they never appear in workflow files or logs:

Secret	Maps to input	Required	Purpose
`DATALAYER_API_KEY`	`api-key`	✅ Required	Authenticates every call the action makes to Datalayer.
`DATALAYER_BILLABLE_ACCOUNT_UID`	`billable-account-uid`	Optional	Billable account context used for eval operations and any optional runtime creation.

Reference them in the consumer workflow:

with:
  api-key: ${{ secrets.DATALAYER_API_KEY }}
  billable-account-uid: ${{ secrets.DATALAYER_BILLABLE_ACCOUNT_UID }}

To make the billable account optional at dispatch time while still defaulting to the secret, use the || fallback:

  billable-account-uid: ${{ inputs.billable_account_uid || secrets.DATALAYER_BILLABLE_ACCOUNT_UID }}

Report Upload

When upload-report-artifacts is true (the default), the action uploads the generated markdown and CSV reports (plus the structured .log JSON and any timestamped/secondary/comparison files) as a build artifact in a final step — no extra upload step is required in the consumer workflow.

Inputs

evalset-id: required, target evalset UID
evalset-spec-file: optional, path to primary evalset spec JSON; action creates evalset and reports it
secondary-evalset-id: optional, secondary evalset UID
secondary-evalset-spec-file: optional, path to secondary evalset spec JSON
api-key: required, Datalayer API key
ai-agents-url: optional, override API URL
billable-account-uid: optional, billable account UID context for eval operations and optional runtime creation

When billable-account-uid is omitted (or empty), the action does not force a billing override and calls run in the default account context for the API key.

run-limit: optional, default 50
output-markdown: optional, default evals-report.md
secondary-output-markdown: optional, output file for secondary report
comparison-summary-output: optional, output file for secondary-vs-primary summary
export-csv: optional, default true
upload-report-artifacts: optional, default true; uploads generated markdown/csv/log artifacts in a final step
report-artifact-name: optional, default datalayer-evals-reports
iam-url: optional, IAM URL override used when creating the optional agent runtime
runtimes-url: optional, Runtimes URL override used when creating the optional agent runtime
agentspec-id: optional, create an agent runtime before reporting using this spec id (default example-simple)
agentspec-ids: optional, comma-separated list of spec ids for multi-runtime bootstrap before reporting
agentspec: optional, URL or local file path to YAML/JSON agent spec; mutually exclusive with agentspec-id
agent-environment-name: optional, default ai-agents-env
agent-given-name: optional runtime name for the created agent runtime
agent-time-reservation: optional runtime reservation in minutes, default 10

When the action creates a runtime via agentspec-id or agentspec, it automatically tears the runtime down after report generation (including early-exit paths).

At least one of evalset-id or evalset-spec-file must be provided.

Outputs

report-file: markdown report file path
csv-file: CSV report file path (empty when export-csv=false)
log-file: full structured report JSON log file path (captures all failure causes)
timestamped_report_file: timestamped markdown path
timestamped_csv_file: timestamped CSV path
secondary-report-file: secondary markdown report path
secondary-csv-file: secondary CSV report path
secondary-log-file: secondary structured report JSON log
secondary-timestamped-report-file: secondary timestamped markdown
secondary-timestamped-csv-file: secondary timestamped CSV
comparison-summary-file: generated comparison summary markdown
agent-runtime-pod-name: pod name of runtime optionally created through the core client
agent-runtime-ingress: ingress URL of that optional runtime
agent-runtime-pod-names: JSON array of pod names created through agentspec-ids
agent-runtime-ingresses: JSON array of ingress URLs created through agentspec-ids
failed-run-count: total number of failed runs across primary and secondary reports
primary-failed-run-count: number of failed runs in the primary report
secondary-failed-run-count: number of failed runs in the secondary report

Use From Another Repository

Example workflow step (single evalset):

uses: datalayer/github-actions@v1 with: evalset-id: 01KXXXXXXXXXXXX api-key: ${{ secrets.DATALAYER_API_KEY }} run-limit: "50" output-markdown: artifacts/evals-report.md export-csv: "true"

Example workflow step with runtime bootstrap from spec id before report:

uses: datalayer/github-actions@v1 with: evalset-id: 01KXXXXXXXXXXXX api-key: ${{ secrets.DATALAYER_API_KEY }} agentspec-id: example-simple agent-environment-name: ai-agents-env agent-time-reservation: "10" billable-account-uid: ${{ secrets.DATALAYER_BILLABLE_ACCOUNT_UID }} output-markdown: artifacts/evals-report.md export-csv: "true"

Example workflow step with multi-agentspec bootstrap:

uses: datalayer/github-actions@v1 with: evalset-id: 01KXXXXXXXXXXXX api-key: ${{ secrets.DATALAYER_API_KEY }} agentspec-ids: example-evals,example-evals-nocodemode agent-environment-name: ai-agents-env agent-time-reservation: "10" output-markdown: artifacts/evals-report.md export-csv: "true"

The action now includes a final upload step by default (upload-report-artifacts=true) that publishes markdown/csv/log artifacts.

To disable built-in upload and manage upload yourself:

uses: datalayer/github-actions@v1 with: evalset-id: 01KXXXXXXXXXXXX api-key: ${{ secrets.DATALAYER_API_KEY }} upload-report-artifacts: "false"

Example workflow step (two spec files, one comparison run):

uses: datalayer/github-actions@v1 with: evalset-spec-file: .github/evals/no-codemode.evalset.json secondary-evalset-spec-file: .github/evals/codemode.evalset.json api-key: ${{ secrets.DATALAYER_API_KEY }} output-markdown: artifacts/no-codemode-report.md secondary-output-markdown: artifacts/codemode-report.md comparison-summary-output: artifacts/comparison-summary.md export-csv: "true"

Upload artifacts in the consumer workflow:

uses: actions/upload-artifact@v4 with: name: evals-report path: | artifacts/evals-report.md artifacts/evals-report.csv artifacts/evals-report.md.log

For two-spec comparison mode, also upload:

	artifacts/no-codemode-report.md
	artifacts/no-codemode-report.csv
	artifacts/no-codemode-report.md.log
	artifacts/codemode-report.md
	artifacts/codemode-report.csv
	artifacts/codemode-report.md.log
	artifacts/comparison-summary.md

Publish New Versions

Commit and push changes to main.
Tag a version.
Push the tag.

Commands:

git tag -a v1.0.0 -m "datalayer-evals v1.0.0" git push origin v1.0.0

Recommended tag strategy:

Maintain a moving major tag for stable consumers.
Example:
- v1.0.0 immutable release tag
- v1 moving major tag

Move major tag:

git tag -f v1 v1.0.0 git push -f origin v1

Consumers should reference v1 for stable updates, or pin an immutable tag for strict reproducibility.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.github/workflows		.github/workflows
src		src
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
action.yml		action.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

☰ 🎬 Datalayer GitHub Actions

datalayer-evals

Secrets

Report Upload

Inputs

Outputs

Use From Another Repository

Publish New Versions

About

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

☰ 🎬 Datalayer GitHub Actions

datalayer-evals

Secrets

Report Upload

Inputs

Outputs

Use From Another Repository

Publish New Versions

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages