Support compression for Actions logs to save storage space and
bandwidth. Inspired by
https://github.com/go-gitea/gitea/issues/24256#issuecomment-1521153015
The biggest challenge is that the compression format should support
[seekable](https://github.com/facebook/zstd/blob/dev/contrib/seekable_format/zstd_seekable_compression_format.md).
So when users are viewing a part of the log lines, Gitea doesn't need to
download the whole compressed file and decompress it.
That means gzip cannot help here. And I did research, there aren't too
many choices, like bgzip and xz, but I think zstd is the most popular
one. It has an implementation in Golang with
[zstd](https://github.com/klauspost/compress/tree/master/zstd) and
[zstd-seekable-format-go](https://github.com/SaveTheRbtz/zstd-seekable-format-go),
and what is better is that it has good compatibility: a seekable format
zstd file can be read by a regular zstd reader.
This PR introduces a new package `zstd` to combine and wrap the two
packages, to provide a unified and easy-to-use API.
And a new setting `LOG_COMPRESSION` is added to the config, although I
don't see any reason why not to use compression, I think's it's a good
idea to keep the default with `none` to be consistent with old versions.
`LOG_COMPRESSION` takes effect for only new log files, it adds `.zst` as
an extension to the file name, so Gitea can determine if it needs
decompression according to the file name when reading. Old files will
keep the format since it's not worth converting them, as they will be
cleared after #31735.
<img width="541" alt="image"
src="https://github.com/user-attachments/assets/e9598764-a4e0-4b68-8c2b-f769265183c9">
Fix#31657.
According to the
[doc](https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#onschedule)
of GitHub Actions, The timezone for cron should be UTC, not the local
timezone. And Gitea Actions doesn't have any reasons to change this, so
I think it's a bug.
However, Gitea Actions has extended the syntax, as it supports
descriptors like `@weekly` and `@every 5m`, and supports specifying the
timezone like `TZ=UTC 0 10 * * *`. So we can make it use UTC only when
the timezone is not specified, to be compatible with GitHub Actions, and
also respect the user's specified.
It does break the feature because the times to run tasks would be
changed, and it may confuse users. So I don't think we should backport
this.
## ⚠️ BREAKING ⚠️
If the server's local time zone is not UTC, a scheduled task would run
at a different time after upgrading Gitea to this version.
Fix#31707.
Also related to #31715.
Some Actions resources could has different types of ownership. It could
be:
- global: all repos and orgs/users can use it.
- org/user level: only the org/user can use it.
- repo level: only the repo can use it.
There are two ways to distinguish org/user level from repo level:
1. `{owner_id: 1, repo_id: 2}` for repo level, and `{owner_id: 1,
repo_id: 0}` for org level.
2. `{owner_id: 0, repo_id: 2}` for repo level, and `{owner_id: 1,
repo_id: 0}` for org level.
The first way seems more reasonable, but it may not be true. The point
is that although a resource, like a runner, belongs to a repo (it can be
used by the repo), the runner doesn't belong to the repo's org (other
repos in the same org cannot use the runner). So, the second method
makes more sense.
And the first way is not user-friendly to query, we must set the repo id
to zero to avoid wrong results.
So, #31715 should be right. And the most simple way to fix#31707 is
just:
```diff
- shared.GetRegistrationToken(ctx, ctx.Repo.Repository.OwnerID, ctx.Repo.Repository.ID)
+ shared.GetRegistrationToken(ctx, 0, ctx.Repo.Repository.ID)
```
However, it is quite intuitive to set both owner id and repo id since
the repo belongs to the owner. So I prefer to be compatible with it. If
we get both owner id and repo id not zero when creating or finding, it's
very clear that the caller want one with repo level, but set owner id
accidentally. So it's OK to accept it but fix the owner id to zero.
Misspell 0.5.0 supports passing a csv file to extend the list of
misspellings, so I added some common ones from the codebase. There is at
least one typo in a API response so we need to decided whether to revert
that and then likely remove the dict entry.
Follow #29468
1. Interpolate runs-on with variables when scheduling tasks.
2. The `GetVariablesOfRun` function will check if the `Repo` of the run
is nil.
---------
Co-authored-by: Giteabot <teabot@gitea.io>
![image](https://github.com/go-gitea/gitea/assets/18380374/ddf6ee84-2242-49b9-b066-bd8429ba4d76)
When repo is a mirror, and commit author is an external user, then
`GetUserByEmail` will return error.
reproduce/test:
- mirror Gitea to your instance
- disable action and enable it again, this will trigger
`DetectAndHandleSchedules`
ps: also follow #24706, it only fixed normal runs, not scheduled runs.
Fix#30243
We only checking unit disabled when detecting workflows, but not in
runner `FetchTask`.
So if a workflow was detected when action unit is enabled, but disabled
later, `FetchTask` will still return these detected actions.
Global setting: repo.ENABLED and repository.`DISABLED_REPO_UNITS` will
not effect this.
Clarify when "string" should be used (and be escaped), and when
"template.HTML" should be used (no need to escape)
And help PRs like #29059 , to render the error messages correctly.
Fix#28157
This PR fix the possible bugs about actions schedule.
## The Changes
- Move `UpdateRepositoryUnit` and `SetRepoDefaultBranch` from models to
service layer
- Remove schedules plan from database and cancel waiting & running
schedules tasks in this repository when actions unit has been disabled
or global disabled.
- Remove schedules plan from database and cancel waiting & running
schedules tasks in this repository when default branch changed.
Introduce the new generic deletion methods
- `func DeleteByID[T any](ctx context.Context, id int64) (int64, error)`
- `func DeleteByIDs[T any](ctx context.Context, ids ...int64) error`
- `func Delete[T any](ctx context.Context, opts FindOptions) (int64,
error)`
So, we no longer need any specific deletion method and can just use
the generic ones instead.
Replacement of #28450Closes#28450
---------
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
When we pick up a job, all waiting jobs should firstly be ordered by
update time,
otherwise when there's a running job, if I rerun an older job, the older
job will run first, as it's id is smaller.
- On user deletion, delete action runners that the user has created.
- Add a database consistency check to remove action runners that have
nonexistent belonging owner.
- Resolves https://codeberg.org/forgejo/forgejo/issues/1720
(cherry picked from commit 009ca7223dab054f7f760b7ccae69e745eebfabb)
Co-authored-by: Gusted <postmaster@gusted.xyz>
Partially Fix#25041
This PR redefined the meaning of column `is_active` in table
`action_runner_token`.
Before this PR, `is_active` means whether it has been used by any
runner. If it's true, other runner cannot use it to register again.
In this PR, `is_active` means whether it's validated to be used to
register runner. And if it's true, then it can be used to register
runners until it become false. When creating a new `is_active` register
token, any previous tokens will be set `is_active` to false.
Currently, Artifact does not have an expiration and automatic cleanup
mechanism, and this feature needs to be added. It contains the following
key points:
- [x] add global artifact retention days option in config file. Default
value is 90 days.
- [x] add cron task to clean up expired artifacts. It should run once a
day.
- [x] support custom retention period from `retention-days: 5` in
`upload-artifact@v3`.
- [x] artifacts link in actions view should be non-clickable text when
expired.
Replace #22751
1. only support the default branch in the repository setting.
2. autoload schedule data from the schedule table after starting the
service.
3. support specific syntax like `@yearly`, `@monthly`, `@weekly`,
`@daily`, `@hourly`
## How to use
See the [GitHub Actions
document](https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#schedule)
for getting more detailed information.
```yaml
on:
schedule:
- cron: '30 5 * * 1,3'
- cron: '30 5 * * 2,4'
jobs:
test_schedule:
runs-on: ubuntu-latest
steps:
- name: Not on Monday or Wednesday
if: github.event.schedule != '30 5 * * 1,3'
run: echo "This step will be skipped on Monday and Wednesday"
- name: Every time
run: echo "This step will always run"
```
Signed-off-by: Bo-Yi.Wu <appleboy.tw@gmail.com>
---------
Co-authored-by: Jason Song <i@wolfogre.com>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
- cancel running jobs if the event is push
- Add a new function `CancelRunningJobs` to cancel all running jobs of a
run
- Update `FindRunOptions` struct to include `Ref` field and update its
condition in `toConds` function
- Implement auto cancellation of running jobs in the same workflow in
`notify` function
related task: https://github.com/go-gitea/gitea/pull/22751/
---------
Signed-off-by: Bo-Yi Wu <appleboy.tw@gmail.com>
Signed-off-by: appleboy <appleboy.tw@gmail.com>
Co-authored-by: Jason Song <i@wolfogre.com>
Co-authored-by: delvh <dev.lh@web.de>
Close#24544
Changes:
- Create `action_tasks_version` table to store the latest version of
each scope (global, org and repo).
- When a job with the status of `waiting` is created, the tasks version
of the scopes it belongs to will increase.
- When the status of a job already in the database is updated to
`waiting`, the tasks version of the scopes it belongs to will increase.
- On Gitea side, in `FeatchTask()`, will try to query the
`action_tasks_version` record of the scope of the runner that call
`FetchTask()`. If the record does not exist, will insert a row. Then,
Gitea will compare the version passed from runner to Gitea with the
version in database, if inconsistent, try pick task. Gitea always
returns the latest version from database to the runner.
Related:
- Protocol: https://gitea.com/gitea/actions-proto-def/pulls/10
- Runner: https://gitea.com/gitea/act_runner/pulls/219
current actions artifacts implementation only support single file
artifact. To support multiple files uploading, it needs:
- save each file to each db record with same run-id, same artifact-name
and proper artifact-path
- need change artifact uploading url without artifact-id, multiple files
creates multiple artifact-ids
- support `path` in download-artifact action. artifact should download
to `{path}/{artifact-path}`.
- in repo action view, it provides zip download link in artifacts list
in summary page, no matter this artifact contains single or multiple
files.
Fix#25451.
Bugfixes:
- When stopping the zombie or endless tasks, set `LogInStorage` to true
after transferring the file to storage. It was missing, it could write
to a nonexistent file in DBFS because `LogInStorage` was false.
- Always update `ActionTask.Updated` when there's a new state reported
by the runner, even if there's no change. This is to avoid the task
being judged as a zombie task.
Enhancement:
- Support `Stat()` for DBFS file.
- `WriteLogs` refuses to write if it could result in content holes.
---------
Co-authored-by: Giteabot <teabot@gitea.io>
Fix#25088
This PR adds the support for
[`pull_request_target`](https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_target)
workflow trigger. `pull_request_target` is similar to `pull_request`,
but the workflow triggered by the `pull_request_target` event runs in
the context of the base branch of the pull request rather than the head
branch. Since the workflow from the base is considered trusted, it can
access the secrets and doesn't need approvals to run.