element-synapse

mirror of https://github.com/element-hq/synapse.git synced 2025-12-28 06:47:37 +00:00

Author	SHA1	Message	Date
Erik Johnston	dfd00a986f	Fix sliding sync performance slow down for long lived connections. (#19206 ) Fixes https://github.com/element-hq/synapse/issues/19175 This PR moves tracking of what lazy loaded membership we've sent to each room out of the required state table. This avoids that table from continuously growing, which massively helps performance as we pull out all matching rows for the connection when we receive a request. The new table is only read when we have data in a room to send, so we end up reading a lot fewer rows from the DB. Though we now read from that table for every room we have events to return in, rather than once at the start of the request. For an explanation of how the new table works, see the [comment](https://github.com/element-hq/synapse/blob/erikj/sss_better_membership_storage2/synapse/storage/schema/main/delta/93/02_sliding_sync_members.sql#L15-L38) on the table schema. The table is designed so that we can later prune old entries if we wish, but that is not implemented in this PR. Reviewable commit-by-commit. --------- Co-authored-by: Eric Eastwood <erice@element.io>	2025-12-12 10:02:57 +00:00
Erik Johnston	1bddd25a85	Port `Clock` functions to use `Duration` class (#19229 ) This changes the arguments in clock functions to be `Duration` and converts call sites and constants into `Duration`. There are still some more functions around that should be converted (e.g. `timeout_deferred`), but we leave that to another PR. We also changes `.as_secs()` to return a float, as the rounding broke things subtly. The only reason to keep it (its the same as `timedelta.total_seconds()`) is for symmetry with `as_millis()`. Follows on from https://github.com/element-hq/synapse/pull/19223	2025-12-01 13:55:06 +00:00
Andrew Morgan	703464c1f7	Fix case where `get_partial_current_state_deltas` could return >100 rows (#18960 )	2025-11-26 17:17:04 +00:00
Richard van der Hoff	c928347779	Implement MSC4380: Invite blocking (#19203 ) MSC4380 aims to be a simplified implementation of MSC4155; the hope is that we can get it specced and rolled out rapidly, so that we can resolve the fact that `matrix.org` has enabled MSC4155. The implementation leans heavily on what's already there for MSC4155. It has its own `experimental_features` flag. If both MSC4155 and MSC4380 are enabled, and a user has both configurations set, then we prioritise the MSC4380 one. Contributed wearing my 🎩 Spec Core Team hat.	2025-11-26 16:12:14 +00:00
Erik Johnston	b74c29f694	Move towards a dedicated `Duration` class (#19223 ) We have various constants to try and avoid mistyping of durations, e.g. `ONE_HOUR_SECONDS * MILLISECONDS_PER_SECOND`, however this can get a little verbose and doesn't help with typing. Instead, let's move towards a dedicated `Duration` class (basically a [`timedelta`](https://docs.python.org/3/library/datetime.html#timedelta-objects) with helper methods). This PR introduces the new types and converts all usages of the existing constants with it. Future PRs may work to move the clock methods to also use it (e.g. `call_later` and `looping_call`). Reviewable commit-by-commit.	2025-11-26 10:56:59 +00:00
Andrew Ferrazzutti	fcac7e0282	Write union types as `X \| Y` where possible (#19111 ) aka PEP 604, added in Python 3.10	2025-11-06 14:02:33 -06:00
Andrew Ferrazzutti	fc244bb592	Use type hinting generics in standard collections (#19046 ) aka PEP 585, added in Python 3.9 - https://peps.python.org/pep-0585/ - https://docs.astral.sh/ruff/rules/non-pep585-annotation/	2025-10-22 16:48:19 -05:00
Eric Eastwood	d2c582ef3c	Move unique snowflake homeserver background tasks to `start_background_tasks` (#19037 ) (the standard pattern for this kind of thing)	2025-10-13 10:19:09 -05:00
Eric Eastwood	5adb08f3c9	Remove `MockClock()` (#18992 ) Spawning from adding some logcontext debug logs in https://github.com/element-hq/synapse/pull/18966 and since we're not logging at the `set_current_context(...)` level (see reasoning there), this removes some usage of `set_current_context(...)`. Specifically, `MockClock.call_later(...)` doesn't handle logcontexts correctly. It uses the calling logcontext as the callback context (wrong, as the logcontext could finish before the callback finishes) and it didn't reset back to the sentinel context before handing back to the reactor. It was like this since it was [introduced 10+ years ago](`38da9884e7`). Instead of fixing the implementation which would just be a copy of our normal `Clock`, we can just remove `MockClock`	2025-09-30 11:27:29 -05:00
Eric Eastwood	5143f93dc9	Fix `server_name` in logging context for multiple Synapse instances in one process (#18868 ) ### Background As part of Element's plan to support a light form of vhosting (virtual host) (multiple instances of Synapse in the same Python process), we're currently diving into the details and implications of running multiple instances of Synapse in the same Python process. "Per-tenant logging" tracked internally by https://github.com/element-hq/synapse-small-hosts/issues/48 ### Prior art Previously, we exposed `server_name` by providing a static logging `MetadataFilter` that injected the values: `205d9e4fc4/synapse/config/logger.py (L216)` While this can work fine for the normal case of one Synapse instance per Python process, this configures things globally and isn't compatible when we try to start multiple Synapse instances because each subsequent tenant will overwrite the previous tenant. ### What does this PR do? We remove the `MetadataFilter` and replace it by tracking the `server_name` in the `LoggingContext` and expose it with our existing [`LoggingContextFilter`](`205d9e4fc4/synapse/logging/context.py (L584-L622)`) that we already use to expose information about the `request`. This means that the `server_name` value follows wherever we log as expected even when we have multiple Synapse instances running in the same process. ### A note on logcontext Anywhere, Synapse mistakenly uses the `sentinel` logcontext to log something, we won't know which server sent the log. We've been fixing up `sentinel` logcontext usage as tracked by https://github.com/element-hq/synapse/issues/18905 Any further `sentinel` logcontext usage we find in the future can be fixed piecemeal as normal. `d2a966f922/docs/log_contexts.md (L71-L81)` ### Testing strategy 1. Adjust your logging config to include `%(server_name)s` in the format ```yaml formatters: precise: format: '%(asctime)s - %(server_name)s - %(name)s - %(lineno)d - %(levelname)s - %(request)s - %(message)s' ``` 1. Start Synapse: `poetry run synapse_homeserver --config-path homeserver.yaml` 1. Make some requests (`curl http://localhost:8008/_matrix/client/versions`, etc) 1. Open the homeserver logs and notice the `server_name` in the logs as expected. `unknown_server_from_sentinel_context` is expected for the `sentinel` logcontext (things outside of Synapse).	2025-09-26 17:10:48 -05:00
Andrew Morgan	ddc7627b22	Fix performance regression related to delayed events processing (#18926 )	2025-09-23 09:47:30 +01:00
Eric Eastwood	5a9ca1e3d9	Introduce `Clock.call_when_running(...)` to include logcontext by default (#18944 ) Introduce `Clock.call_when_running(...)` to wrap startup code in a logcontext, ensuring we can identify which server generated the logs. Background: > Ideally, nothing from the Synapse homeserver would be logged against the `sentinel` > logcontext as we want to know which server the logs came from. In practice, this is not > always the case yet especially outside of request handling. > > Global things outside of Synapse (e.g. Twisted reactor code) should run in the > `sentinel` logcontext. It's only when it calls into application code that a logcontext > gets activated. This means the reactor should be started in the `sentinel` logcontext, > and any time an awaitable yields control back to the reactor, it should reset the > logcontext to be the `sentinel` logcontext. This is important to avoid leaking the > current logcontext to the reactor (which would then get picked up and associated with > the next thing the reactor does). > > *-- `docs/log_contexts.md` Also adds a lint to prefer `Clock.call_when_running(...)` over `reactor.callWhenRunning(...)` Part of https://github.com/element-hq/synapse/issues/18905	2025-09-22 10:27:59 -05:00
reivilibre	ada3a3b2b3	Add experimental support for MSC4308: Thread Subscriptions extension to Sliding Sync when MSC4306 and MSC4186 are enabled. (#18695 ) Some checks are pending Schema / Ensure generated documentation is up-to-date (push) Waiting to run Details Tests / changes (push) Waiting to run Details Tests / check-sampleconfig (push) Blocked by required conditions Details Tests / check-schema-delta (push) Blocked by required conditions Details Tests / check-lockfile (push) Waiting to run Details Tests / lint (push) Blocked by required conditions Details Tests / Typechecking (push) Blocked by required conditions Details Tests / lint-crlf (push) Waiting to run Details Tests / lint-newsfile (push) Waiting to run Details Tests / lint-pydantic (push) Blocked by required conditions Details Tests / lint-clippy (push) Blocked by required conditions Details Tests / lint-clippy-nightly (push) Blocked by required conditions Details Tests / lint-rust (push) Blocked by required conditions Details Tests / lint-rustfmt (push) Blocked by required conditions Details Tests / lint-readme (push) Blocked by required conditions Details Tests / linting-done (push) Blocked by required conditions Details Tests / calculate-test-jobs (push) Blocked by required conditions Details Tests / trial (push) Blocked by required conditions Details Tests / trial-olddeps (push) Blocked by required conditions Details Tests / trial-pypy (all, pypy-3.9) (push) Blocked by required conditions Details Tests / sytest (push) Blocked by required conditions Details Tests / export-data (push) Blocked by required conditions Details Tests / portdb (13, 3.9) (push) Blocked by required conditions Details Tests / portdb (17, 3.13) (push) Blocked by required conditions Details Tests / complement (monolith, Postgres) (push) Blocked by required conditions Details Tests / complement (monolith, SQLite) (push) Blocked by required conditions Details Tests / complement (workers, Postgres) (push) Blocked by required conditions Details Tests / cargo-test (push) Blocked by required conditions Details Tests / cargo-bench (push) Blocked by required conditions Details Tests / tests-done (push) Blocked by required conditions Details Closes: #18436 Implements: https://github.com/matrix-org/matrix-spec-proposals/pull/4308 Follows: #18674 Adds an extension to Sliding Sync and a companion endpoint needed for backpaginating missed thread subscription changes, as described in MSC4308 --------- Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org> Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2025-09-11 14:45:04 +01:00
Devon Hudson	9301baa5f8	Fix hydra tests	2025-08-11 11:32:57 -06:00
Devon Hudson	bd8f12f9c6	Fix broken test	2025-08-11 16:43:45 +01:00
Kegan Dougal	0eb7252a23	Support for room version 12	2025-08-11 16:43:45 +01:00
reivilibre	6514381b02	Implement the push rules for experimental MSC4306: Thread Subscriptions. (#18762 ) Follows: #18756 Implements: MSC4306 --------- Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org> Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2025-08-06 15:33:52 +01:00
reivilibre	8306cee06a	Update implementation of MSC4306: Thread Subscriptions to include automatic subscription conflict prevention as introduced in later drafts. (#18756 ) Follows: #18674 Implements new drafts of MSC4306 --------- Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org> Co-authored-by: Eric Eastwood <erice@element.io>	2025-08-05 18:22:53 +00:00
reivilibre	a31d53b28f	Use `twisted.internet.testing` module in tests instead of deprecated `twisted.test.proto_helpers`. (#18728 ) Follows: #18727 --------- Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>	2025-07-30 12:32:10 +01:00
Eric Eastwood	d4af2970f3	Refactor `Histogram` metrics to be homeserver-scoped (#18724 ) Bulk refactor `Histogram` metrics to be homeserver-scoped. We also add lints to make sure that new `Histogram` metrics don't sneak in without using the `server_name` label (`SERVER_NAME_LABEL`). Part of https://github.com/element-hq/synapse/issues/18592 ### Testing strategy 1. Add the `metrics` listener in your `homeserver.yaml` ```yaml listeners: # This is just showing how to configure metrics either way # # `http` `metrics` resource - port: 9322 type: http bind_addresses: ['127.0.0.1'] resources: - names: [metrics] compress: false # `metrics` listener - port: 9323 type: metrics bind_addresses: ['127.0.0.1'] ``` 1. Start the homeserver: `poetry run synapse_homeserver --config-path homeserver.yaml` 1. Fetch `http://localhost:9322/_synapse/metrics` and/or `http://localhost:9323/metrics` 1. Observe response includes the TODO metrics with the `server_name` label ### Todo - [x] Wait for https://github.com/element-hq/synapse/pull/18656 to merge ### Dev notes ``` LoggingDatabaseConnection make_conn make_pool make_fake_db_pool ``` ### Pull Request Checklist <!-- Please read https://element-hq.github.io/synapse/latest/development/contributing_guide.html before submitting your pull request --> * [x] Pull request is based on the develop branch * [x] Pull request includes a [changelog file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog). The entry should: - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from `EventStore` to `EventWorkerStore`.". - Use markdown where necessary, mostly for `code blocks`. - End with either a period (.) or an exclamation mark (!). - Start with a capital letter. - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry. * [x] [Code style](https://element-hq.github.io/synapse/latest/code_style.html) is correct (run the [linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))	2025-07-29 15:35:38 -05:00
Eric Eastwood	5106818bd0	Refactor `GaugeBucketCollector` metrics to be homeserver-scoped (#18715 ) Refactor `GaugeBucketCollector` metrics to be homeserver-scoped Part of https://github.com/element-hq/synapse/issues/18592 ### Testing strategy 1. Add the `metrics` listener in your `homeserver.yaml` ```yaml listeners: # This is just showing how to configure metrics either way # # `http` `metrics` resource - port: 9322 type: http bind_addresses: ['127.0.0.1'] resources: - names: [metrics] compress: false # `metrics` listener - port: 9323 type: metrics bind_addresses: ['127.0.0.1'] ``` 1. Start the homeserver: `poetry run synapse_homeserver --config-path homeserver.yaml` 1. Fetch `http://localhost:9322/_synapse/metrics` and/or `http://localhost:9323/metrics` 1. Adjust the number of [`msecs` in the `looping_call` so that `_read_forward_extremities`](`a82b8a966a/synapse/storage/databases/main/metrics.py (L79)`) runs immediately instead of after an hour. 1. Observe response includes the `synapse_forward_extremities` and `synapse_excess_extremity_events` metrics with the `server_name` label	2025-07-29 11:46:21 -05:00
Eric Eastwood	b7e7f537f1	Refactor background process metrics to be homeserver-scoped (#18670 ) Part of https://github.com/element-hq/synapse/issues/18592 Separated out of https://github.com/element-hq/synapse/pull/18656 because it's a bigger, unique piece of the refactor ### Testing strategy 1. Add the `metrics` listener in your `homeserver.yaml` ```yaml listeners: # This is just showing how to configure metrics either way # # `http` `metrics` resource - port: 9322 type: http bind_addresses: ['127.0.0.1'] resources: - names: [metrics] compress: false # `metrics` listener - port: 9323 type: metrics bind_addresses: ['127.0.0.1'] ``` 1. Start the homeserver: `poetry run synapse_homeserver --config-path homeserver.yaml` 1. Fetch `http://localhost:9322/_synapse/metrics` and/or `http://localhost:9323/metrics` 1. Observe response includes the background processs metrics (`synapse_background_process_start_count`, `synapse_background_process_db_txn_count_total`, etc) with the `server_name` label	2025-07-23 13:28:17 -05:00
reivilibre	875269eb53	Add experimental and incomplete support for MSC4306: Thread Subscriptions. (#18674 ) Implements: [MSC4306](https://github.com/matrix-org/matrix-spec-proposals/blob/rei/msc_thread_subscriptions/proposals/4306-thread-subscriptions.md) (partially) What's missing: - Changes to push rules Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>	2025-07-21 15:54:28 +01:00
Eric Eastwood	cda922830e	Clean up `MetricsResource` and Prometheus hacks (#18687 ) Clean up `MetricsResource`, Prometheus hacks (`_set_prometheus_client_use_created_metrics`), and better document why we care about having a separate `metrics` listener type. These clean-up changes have been split out from https://github.com/element-hq/synapse/pull/18584 since that PR was closed.	2025-07-17 11:57:19 -05:00
Eric Eastwood	88785dbaeb	Refactor cache metrics to be homeserver-scoped (#18604 ) (add `server_name` label to cache metrics). Part of https://github.com/element-hq/synapse/issues/18592	2025-07-16 16:04:57 -05:00
Andrew Morgan	be4c95baf1	Replace PyICU with Rust `icu_segmenter` crate (#18553 ) Co-authored-by: anoa's Codex Agent <codex@amorgan.xyz> Co-authored-by: Quentin Gliech <quenting@element.io>	2025-07-03 11:12:12 +01:00
Andrew Morgan	6791e6e250	Unbreak unit tests with Twisted `25.5.0` by add `parsePOSTFormSubmission` arg to `FakeSite` (#18577 ) Co-authored-by: anoa's Codex Agent <codex@amorgan.xyz>	2025-06-24 11:52:06 +01:00
Erik Johnston	33e0c25279	Clean up old `device_federation_inbox` rows (#18546 ) Fixes https://github.com/element-hq/synapse/issues/17370	2025-06-18 11:58:31 +00:00
Will Hunt	6e600c986e	Don't allow users to ignore themselves. (#18508 ) Fixes the self-ignore issues we've being seeing of reports of by ignoring bad requests from clients. Fixes https://github.com/element-hq/synapse/issues/11963 Fix https://github.com/element-hq/element-web/issues/29969 although this should also be fixed on the client to avoid confusing errors popping up while rejecting invites. Related to https://github.com/matrix-org/matrix-rust-sdk/issues/5073	2025-06-06 15:37:15 +01:00
Will Hunt	8010377a88	Add support for MSC4155 Invite filtering (#18288 ) This implements https://github.com/matrix-org/matrix-spec-proposals/pull/4155, which adds support for a new account data type that blocks an invite based on some conditions in the event contents. --------- Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2025-06-05 11:49:09 +01:00
Devon Hudson	99cbd33630	Merge branch 'master' into develop	2025-05-20 09:36:05 -06:00
dependabot[bot]	9d43bec326	Bump ruff from 0.7.3 to 0.11.10 (#18451 ) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Andrew Morgan <andrew@amorgan.xyz> Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2025-05-20 15:23:30 +01:00
Erik Johnston	67920c0aca	Fix up the topological ordering for events above `MAX_DEPTH` (#18447 ) Synapse previously did not correctly cap the max depth of an event to the max canonical json int. This can cause ordering issues for any events that were sent locally at the time. This background update goes and correctly caps the topological ordering to the new `MAX_DEPTH`. c.f. GHSA-v56r-hwv5-mxg6 --------- Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2025-05-19 13:36:30 +01:00
Devon Hudson	89cb613a4e	Revert "Add total event, unencrypted message, and e2ee event counts to stats reporting" (#18346 ) Reverts element-hq/synapse#18260 It is causing a failure when building release debs for `debian:bullseye` with the following error: ``` sqlite3.OperationalError: near "RETURNING": syntax error ```	2025-04-16 16:41:41 +00:00
Andrew Morgan	a832375bfb	Add total event, unencrypted message, and e2ee event counts to stats reporting (#18260 ) Co-authored-by: Eric Eastwood <erice@element.io>	2025-04-15 07:49:08 -07:00
Devon Hudson	1efb826b54	Delete unreferenced state groups in background (#18254 ) This PR fixes #18154 to avoid de-deltaing state groups which resulted in DB size temporarily increasing until the DB was `VACUUM`'ed. As a result, less state groups will get deleted now. It also attempts to improve performance by not duplicating work when processing state groups it has already processed in previous iterations. ### Pull Request Checklist <!-- Please read https://element-hq.github.io/synapse/latest/development/contributing_guide.html before submitting your pull request --> * [X] Pull request is based on the develop branch * [X] Pull request includes a [changelog file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog). The entry should: - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from `EventStore` to `EventWorkerStore`.". - Use markdown where necessary, mostly for `code blocks`. - End with either a period (.) or an exclamation mark (!). - Start with a capital letter. - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry. * [X] [Code style](https://element-hq.github.io/synapse/latest/code_style.html) is correct (run the [linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters)) --------- Co-authored-by: Erik Johnston <erikj@element.io>	2025-03-21 17:09:49 +00:00
reivilibre	8295de87a7	Revert the background job to clear unreferenced state groups (that was introduced in v1.126.0rc1), due to a suspected issue that causes increased disk usage. (#18222 ) Revert "Add background job to clear unreferenced state groups (#18154)" This mechanism is suspected of inserting large numbers of rows into `state_groups_state`, thus unreasonably increasing disk usage. See: https://github.com/element-hq/synapse/issues/18217 This reverts commit `5121f9210c` (#18154). --------- Signed-off-by: Olivier 'reivilibre <oliverw@matrix.org>	2025-03-07 15:44:13 +00:00
Devon Hudson	5121f9210c	Add background job to clear unreferenced state groups (#18154 ) Fixes #18150 ### Pull Request Checklist <!-- Please read https://element-hq.github.io/synapse/latest/development/contributing_guide.html before submitting your pull request --> * [X] Pull request is based on the develop branch * [x] Pull request includes a [changelog file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog). The entry should: - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from `EventStore` to `EventWorkerStore`.". - Use markdown where necessary, mostly for `code blocks`. - End with either a period (.) or an exclamation mark (!). - Start with a capital letter. - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry. * [X] [Code style](https://element-hq.github.io/synapse/latest/code_style.html) is correct (run the [linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters)) --------- Co-authored-by: Erik Johnston <erikj@element.io>	2025-02-25 16:25:39 +00:00
Devon Hudson	ecad88f5c5	Cleanup deleted state group references (#18165 ) ### Pull Request Checklist <!-- Please read https://element-hq.github.io/synapse/latest/development/contributing_guide.html before submitting your pull request --> * [x] Pull request is based on the develop branch * [x] Pull request includes a [changelog file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog). The entry should: - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from `EventStore` to `EventWorkerStore`.". - Use markdown where necessary, mostly for `code blocks`. - End with either a period (.) or an exclamation mark (!). - Start with a capital letter. - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry. * [x] [Code style](https://element-hq.github.io/synapse/latest/code_style.html) is correct (run the [linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))	2025-02-18 14:44:38 +00:00
Erik Johnston	c46d452c7c	Fix bug where purging history could lead to increase in disk space usage (#18131 ) Some checks are pending Build release artifacts / Build sdist (push) Waiting to run Details Build release artifacts / Attach assets to release (push) Blocked by required conditions Details Tests / changes (push) Waiting to run Details Tests / check-sampleconfig (push) Blocked by required conditions Details Tests / check-schema-delta (push) Blocked by required conditions Details Tests / check-lockfile (push) Waiting to run Details Tests / lint (push) Blocked by required conditions Details Tests / Typechecking (push) Blocked by required conditions Details Tests / lint-crlf (push) Waiting to run Details Tests / lint-newsfile (push) Waiting to run Details Tests / lint-pydantic (push) Blocked by required conditions Details Tests / lint-clippy (push) Blocked by required conditions Details Tests / lint-clippy-nightly (push) Blocked by required conditions Details Tests / lint-rustfmt (push) Blocked by required conditions Details Tests / lint-readme (push) Blocked by required conditions Details Tests / linting-done (push) Blocked by required conditions Details Tests / calculate-test-jobs (push) Blocked by required conditions Details Tests / trial (push) Blocked by required conditions Details Tests / trial-olddeps (push) Blocked by required conditions Details Tests / trial-pypy (all, pypy-3.9) (push) Blocked by required conditions Details Tests / sytest (push) Blocked by required conditions Details Tests / export-data (push) Blocked by required conditions Details Tests / portdb (13, 3.9) (push) Blocked by required conditions Details Tests / portdb (17, 3.13) (push) Blocked by required conditions Details Tests / complement (monolith, Postgres) (push) Blocked by required conditions Details Tests / complement (monolith, SQLite) (push) Blocked by required conditions Details Tests / complement (workers, Postgres) (push) Blocked by required conditions Details Tests / cargo-test (push) Blocked by required conditions Details Tests / cargo-bench (push) Blocked by required conditions Details Tests / tests-done (push) Blocked by required conditions Details When purging history, we try and delete any state groups that become unreferenced (i.e. there are no longer any events that directly reference them). When we delete a state group that is referenced by another state group, we "de-delta" that state group so that it no longer refers to the state group that is deleted. There are two bugs with this approach that we fix here: 1. There is a common pattern where we end up storing two state groups when persisting a state event: the state before and after the new state event, where the latter is stored as a delta to the former. When deleting state groups we only deleted the "new" state and left (and potentially de-deltaed) the old state. This was due to a bug/typo when trying to find referenced state groups. 2. There are times where we store unreferenced state groups in the DB, during the purging of history these would not get rechecked and instead always de-deltaed. Instead, we should check for this case and delete any unreferenced state groups rather than de-deltaing them. The effect of the above bugs is that when purging history we'd end up with lots of unreferenced state groups that had been de-deltaed (i.e. stored as the full state). This can lead to dramatic increases in storage space used.	2025-02-03 19:04:19 +00:00
Erik Johnston	27dbb1b429	Add locking to more safely delete state groups: Part 2 (#18130 ) Some checks are pending Build release artifacts / Build sdist (push) Waiting to run Details Build release artifacts / Attach assets to release (push) Blocked by required conditions Details Tests / changes (push) Waiting to run Details Tests / check-sampleconfig (push) Blocked by required conditions Details Tests / check-schema-delta (push) Blocked by required conditions Details Tests / check-lockfile (push) Waiting to run Details Tests / lint (push) Blocked by required conditions Details Tests / Typechecking (push) Blocked by required conditions Details Tests / lint-crlf (push) Waiting to run Details Tests / lint-newsfile (push) Waiting to run Details Tests / lint-pydantic (push) Blocked by required conditions Details Tests / lint-clippy (push) Blocked by required conditions Details Tests / lint-clippy-nightly (push) Blocked by required conditions Details Tests / lint-rustfmt (push) Blocked by required conditions Details Tests / lint-readme (push) Blocked by required conditions Details Tests / linting-done (push) Blocked by required conditions Details Tests / calculate-test-jobs (push) Blocked by required conditions Details Tests / trial (push) Blocked by required conditions Details Tests / trial-olddeps (push) Blocked by required conditions Details Tests / trial-pypy (all, pypy-3.9) (push) Blocked by required conditions Details Tests / sytest (push) Blocked by required conditions Details Tests / export-data (push) Blocked by required conditions Details Tests / portdb (13, 3.9) (push) Blocked by required conditions Details Tests / portdb (17, 3.13) (push) Blocked by required conditions Details Tests / complement (monolith, Postgres) (push) Blocked by required conditions Details Tests / complement (monolith, SQLite) (push) Blocked by required conditions Details Tests / complement (workers, Postgres) (push) Blocked by required conditions Details Tests / cargo-test (push) Blocked by required conditions Details Tests / cargo-bench (push) Blocked by required conditions Details Tests / tests-done (push) Blocked by required conditions Details This actually makes it so that deleting state groups goes via the new mechanism. c.f. #18107	2025-02-03 17:58:55 +00:00
Erik Johnston	aa6e5c2ecb	Add locking to more safely delete state groups: Part 1 (#18107 ) Currently we don't really have anything that stops us from deleting state groups when an in-flight event references it. This is a fairly rare race currently, but we want to be able to more aggressively delete state groups so it is important to address this to ensure that the database remains valid. This implements the locking, but doesn't actually use it. See the class docstring of the new data store for an explanation for how this works. --------- Co-authored-by: Devon Hudson <devon.dmytro@gmail.com>	2025-02-03 17:29:15 +00:00
Eric Eastwood	aab3672037	Bust `_membership_stream_cache` cache when current state changes (#17732 ) Some checks failed Tests / check-lockfile (push) Has been cancelled Details Tests / lint-crlf (push) Has been cancelled Details Tests / lint-newsfile (push) Has been cancelled Details Deploy the documentation / GitHub Pages (push) Has been cancelled Details Build release artifacts / Build .deb packages (push) Has been cancelled Details Build release artifacts / Attach assets to release (push) Has been cancelled Details Tests / check-sampleconfig (push) Has been cancelled Details Tests / check-schema-delta (push) Has been cancelled Details Tests / lint (push) Has been cancelled Details Tests / Typechecking (push) Has been cancelled Details Tests / lint-pydantic (push) Has been cancelled Details Tests / lint-clippy (push) Has been cancelled Details Tests / lint-clippy-nightly (push) Has been cancelled Details Tests / lint-rustfmt (push) Has been cancelled Details Tests / lint-readme (push) Has been cancelled Details Tests / linting-done (push) Has been cancelled Details Tests / calculate-test-jobs (push) Has been cancelled Details Tests / trial (push) Has been cancelled Details Tests / trial-olddeps (push) Has been cancelled Details Tests / trial-pypy (all, pypy-3.9) (push) Has been cancelled Details Tests / sytest (push) Has been cancelled Details Tests / export-data (push) Has been cancelled Details Tests / portdb (13, 3.9) (push) Has been cancelled Details Tests / portdb (17, 3.13) (push) Has been cancelled Details Tests / complement (monolith, Postgres) (push) Has been cancelled Details Tests / complement (monolith, SQLite) (push) Has been cancelled Details Tests / complement (workers, Postgres) (push) Has been cancelled Details Tests / cargo-test (push) Has been cancelled Details Tests / cargo-bench (push) Has been cancelled Details Tests / tests-done (push) Has been cancelled Details This is particularly a problem in a state reset scenario where the membership might change without a corresponding event. This PR is targeting a scenario where a state reset happens which causes room membership to change. Previously, the cache would just hold onto stale data and now we properly bust the cache in this scenario. We have a few tests for these scenarios which you can see are now fixed because we can remove the `FIXME` where we were previously manually busting the cache in the test itself. This is a general Synapse thing so by it's nature it helps out Sliding Sync. Fix https://github.com/element-hq/synapse/issues/17368 Prerequisite for https://github.com/element-hq/synapse/issues/17929 --- Match when are busting `_curr_state_delta_stream_cache`	2025-01-08 10:11:09 -06:00
Devon Hudson	eda735e4bb	Remove support for python 3.8 (#17908 ) ### Pull Request Checklist <!-- Please read https://element-hq.github.io/synapse/latest/development/contributing_guide.html before submitting your pull request --> * [X] Pull request is based on the develop branch * [X] Pull request includes a [changelog file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog). The entry should: - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from `EventStore` to `EventWorkerStore`.". - Use markdown where necessary, mostly for `code blocks`. - End with either a period (.) or an exclamation mark (!). - Start with a capital letter. - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry. * [X] [Code style](https://element-hq.github.io/synapse/latest/code_style.html) is correct (run the [linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters)) --------- Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>	2024-11-06 19:36:01 +00:00
Eric Eastwood	a5e16a4ab5	Sliding Sync: Reset `forgotten` status when membership changes (like rejoining a room) (#17835 ) Some checks failed Tests / check-lockfile (push) Has been cancelled Details Tests / lint-crlf (push) Has been cancelled Details Tests / lint-newsfile (push) Has been cancelled Details Deploy the documentation / GitHub Pages (push) Has been cancelled Details Build release artifacts / Build .deb packages (push) Has been cancelled Details Build release artifacts / Attach assets to release (push) Has been cancelled Details Tests / check-sampleconfig (push) Has been cancelled Details Tests / check-schema-delta (push) Has been cancelled Details Tests / lint (push) Has been cancelled Details Tests / Typechecking (push) Has been cancelled Details Tests / lint-pydantic (push) Has been cancelled Details Tests / lint-clippy (push) Has been cancelled Details Tests / lint-clippy-nightly (push) Has been cancelled Details Tests / lint-rustfmt (push) Has been cancelled Details Tests / lint-readme (push) Has been cancelled Details Tests / linting-done (push) Has been cancelled Details Tests / calculate-test-jobs (push) Has been cancelled Details Tests / trial (push) Has been cancelled Details Tests / trial-olddeps (push) Has been cancelled Details Tests / trial-pypy (all, pypy-3.8) (push) Has been cancelled Details Tests / sytest (push) Has been cancelled Details Tests / export-data (push) Has been cancelled Details Tests / portdb (11, 3.8) (push) Has been cancelled Details Tests / portdb (15, 3.11) (push) Has been cancelled Details Tests / complement (monolith, Postgres) (push) Has been cancelled Details Tests / complement (monolith, SQLite) (push) Has been cancelled Details Tests / complement (workers, Postgres) (push) Has been cancelled Details Tests / cargo-test (push) Has been cancelled Details Tests / cargo-bench (push) Has been cancelled Details Tests / tests-done (push) Has been cancelled Details Reset `sliding_sync_membership_snapshots` -> `forgotten` status when membership changes (like rejoining a room). Fix https://github.com/element-hq/synapse/issues/17781 ### What was the problem before? Previously, if someone used `/forget` on one of their rooms, it would update `sliding_sync_membership_snapshots` as expected but when someone rejoined the room (or had any membership change), the upsert didn't overwrite and reset the `forgotten` status so it remained `forgotten` and invisible down the Sliding Sync endpoint.	2024-10-22 11:06:46 +01:00
Eric Eastwood	adda2a4613	Sliding Sync: Slight optimization when fetching state for the room (`get_events_as_list(...)`) (#17718 ) Spawning from @kegsay [pointing out](https://matrix.to/#/!cnVVNLKqgUzNTOFQkz:matrix.org/$ExOO7J8uPUQSyH-9Uxc_QCa8jlXX9uK4VRtkSC0EI3o?via=element.io&via=matrix.org&via=jki.re) that the Sliding Sync endpoint doesn't handle a large room with a lot of state well on initial sync (requesting all state via `required_state: [ ["",""] ]`) (it just takes forever). After investigating further, the slow part is just `get_events_as_list(...)` fetching all of the current state ID's out for the room (which can be 100k+ events for rooms with a lot of membership). This is just a slow thing in Synapse in general and the same thing happens in Sync v2 or the `/state` endpoint. --- The only idea I had to improve things was to use `batch_iter` to only try fetching a fixed amount at a time instead of working with large maps, lists, and sets. This doesn't seem to have much effect though. There is already a `batch_iter(event_ids, 200)` in `_fetch_event_rows(...)` for when we actually have to touch the database and that's inside a queue to deduplicate work. I did notice one slight optimization to use `get_events_as_list(...)` directly instead of `get_events(...)`. `get_events(...)` just turns the result from `get_events_as_list(...)` into a dict and since we're just iterating over the events, we don't need the dict/map.	2024-10-14 13:47:35 +01:00
Eric Eastwood	c2e5e9e67c	Sliding Sync: Avoid fetching left rooms and add back `newly_left` rooms (#17725 ) Some checks are pending Build release artifacts / Build sdist (push) Waiting to run Details Build release artifacts / Attach assets to release (push) Blocked by required conditions Details Tests / changes (push) Waiting to run Details Tests / check-sampleconfig (push) Blocked by required conditions Details Tests / check-schema-delta (push) Blocked by required conditions Details Tests / check-lockfile (push) Waiting to run Details Tests / lint (push) Blocked by required conditions Details Tests / Typechecking (push) Blocked by required conditions Details Tests / lint-crlf (push) Waiting to run Details Tests / lint-newsfile (push) Waiting to run Details Tests / lint-pydantic (push) Blocked by required conditions Details Tests / lint-clippy (push) Blocked by required conditions Details Tests / lint-clippy-nightly (push) Blocked by required conditions Details Tests / lint-rustfmt (push) Blocked by required conditions Details Tests / lint-readme (push) Blocked by required conditions Details Tests / linting-done (push) Blocked by required conditions Details Tests / calculate-test-jobs (push) Blocked by required conditions Details Tests / trial (push) Blocked by required conditions Details Tests / trial-olddeps (push) Blocked by required conditions Details Tests / trial-pypy (all, pypy-3.8) (push) Blocked by required conditions Details Tests / sytest (push) Blocked by required conditions Details Tests / export-data (push) Blocked by required conditions Details Tests / portdb (11, 3.8) (push) Blocked by required conditions Details Tests / portdb (15, 3.11) (push) Blocked by required conditions Details Tests / complement (monolith, Postgres) (push) Blocked by required conditions Details Tests / complement (monolith, SQLite) (push) Blocked by required conditions Details Tests / complement (workers, Postgres) (push) Blocked by required conditions Details Tests / cargo-test (push) Blocked by required conditions Details Tests / cargo-bench (push) Blocked by required conditions Details Tests / tests-done (push) Blocked by required conditions Details Performance optimization: We can avoid fetching rooms that the user has left themselves (which could be a significant amount), then only add back rooms that the user has `newly_left` (left in the token range of an incremental sync). It's a lot faster to fetch less rooms than fetch them all and throw them away in most cases. Since the user only leaves a room (or is state reset out) once in a blue moon, we can avoid a lot of work. Based on @erikjohnston's branch, erikj/ss_perf --------- Co-authored-by: Erik Johnston <erik@matrix.org>	2024-09-19 10:07:18 -05:00
Eric Eastwood	16af80b8fb	Sliding Sync: Use Sliding Sync tables for sorting (#17693 ) Some checks are pending Build release artifacts / Build sdist (push) Waiting to run Details Build release artifacts / Attach assets to release (push) Blocked by required conditions Details Tests / changes (push) Waiting to run Details Tests / check-sampleconfig (push) Blocked by required conditions Details Tests / check-schema-delta (push) Blocked by required conditions Details Tests / check-lockfile (push) Waiting to run Details Tests / lint (push) Blocked by required conditions Details Tests / Typechecking (push) Blocked by required conditions Details Tests / lint-crlf (push) Waiting to run Details Tests / lint-newsfile (push) Waiting to run Details Tests / lint-pydantic (push) Blocked by required conditions Details Tests / lint-clippy (push) Blocked by required conditions Details Tests / lint-clippy-nightly (push) Blocked by required conditions Details Tests / lint-rustfmt (push) Blocked by required conditions Details Tests / lint-readme (push) Blocked by required conditions Details Tests / linting-done (push) Blocked by required conditions Details Tests / calculate-test-jobs (push) Blocked by required conditions Details Tests / trial (push) Blocked by required conditions Details Tests / trial-olddeps (push) Blocked by required conditions Details Tests / trial-pypy (all, pypy-3.8) (push) Blocked by required conditions Details Tests / sytest (push) Blocked by required conditions Details Tests / export-data (push) Blocked by required conditions Details Tests / portdb (11, 3.8) (push) Blocked by required conditions Details Tests / portdb (15, 3.11) (push) Blocked by required conditions Details Tests / complement (monolith, Postgres) (push) Blocked by required conditions Details Tests / complement (monolith, SQLite) (push) Blocked by required conditions Details Tests / complement (workers, Postgres) (push) Blocked by required conditions Details Tests / cargo-test (push) Blocked by required conditions Details Tests / cargo-bench (push) Blocked by required conditions Details Tests / tests-done (push) Blocked by required conditions Details Use Sliding Sync tables for sorting (`bulk_get_last_event_pos_in_room_before_stream_ordering(...)` -> `_bulk_get_max_event_pos(...)`)	2024-09-11 12:16:24 -05:00
Erik Johnston	596b96411b	Sliding sync: various fixups to the background update (#17652 )	2024-09-11 15:38:46 +01:00
Erik Johnston	588e5b521d	Sliding Sync: Retrieve fewer events from DB in sync (#17688 ) When using timeline limit of 1 we end up fetching 2 events from the DB purely to tell if the response was "limited" or not. Lets not do that.	2024-09-10 09:52:42 +01:00

1 2 3 4 5 ...

779 Commits