coolify

mirror of https://github.com/coollabsio/coolify.git synced 2025-12-28 05:34:50 +00:00

Author	SHA1	Message	Date
Andras Bacsai	6d47d24169	Fix standalone database "restarting" status flickering and add restart tracking - Fix status flickering: Track databases in active/transient states (restarting, starting, created, paused) not just running - Add isActiveOrTransient() helper to distinguish between active states and terminal states (exited, dead) - Add safeguard: Protect updateNotFoundDatabaseStatus() from marking as exited when containers collection is empty - Add restart_count tracking: New migration adds restart_count, last_restart_at, last_restart_type to all standalone database tables - Update 8 database models with $casts for new restart tracking fields - Update GetContainersStatus to extract RestartCount from Docker and update database models - Reset restart tracking when database exits completely 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2025-12-17 16:25:41 +01:00
Andras Bacsai	0efa4af5c3	Optimize PushServerUpdateJob performance with batch updates and async jobs - Eager load service applications and databases to eliminate N+1 queries - Replace individual model updates with batch database updates for applications, previews, and services - Move connectProxyToNetworks to async ConnectProxyToNetworksJob to avoid blocking status updates - Optimize Server.databases() and applications() methods with efficient database queries - Use flatMap for cleaner collection transformations 🤖 Generated with Claude Code Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2025-12-15 14:06:32 +01:00
Andras Bacsai	66e81d6d96	Fix container status display: preserve "Restarting" for applications and sub-resources Add preserveRestarting parameter to ContainerStatusAggregator to allow applications and service sub-resources to display "Restarting" status instead of being marked as "Degraded". This gives better visibility into container restart behavior. - Update ContainerStatusAggregator to accept preserveRestarting parameter (defaults to false) - Update GetContainersStatus to use preserveRestarting: true for applications and service sub-resources - Update PushServerUpdateJob to use preserveRestarting: true for applications and service sub-resources - Add comprehensive documentation explaining the parameter behavior and when to use it 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-12-03 08:23:35 +01:00
Andras Bacsai	e0dc12678b	fix: comprehensive SERVICE_URL/SERVICE_FQDN handling improvements and queue reliability fixes (#7275)	2025-11-24 11:47:11 +01:00
Andras Bacsai	ac9eca3c05	fix: don't show health status for exited containers Exited containers don't run health checks, so showing "(unhealthy)" is misleading. This fix ensures exited status displays without health suffixes across all monitoring systems (SSH, Sentinel, services, etc.) and at the UI layer for backward compatibility with existing data. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-24 09:09:37 +01:00
Andras Bacsai	85b73a8c00	fix: initialize Collection properties to handle queue deserialization edge cases	2025-11-21 12:25:25 +01:00
Andras Bacsai	ae6eef3cdb	feat(tests): add comprehensive tests for ContainerStatusAggregator and serverStatus accessor - Introduced tests for ContainerStatusAggregator to validate status aggregation logic across various container states. - Implemented tests to ensure serverStatus accessor correctly checks server infrastructure health without being affected by container status. - Updated ExcludeFromHealthCheckTest to verify excluded status handling in various components. - Removed obsolete PushServerUpdateJobStatusAggregationTest as its functionality is covered elsewhere. - Updated version number for sentinel to 0.0.17 in versions.json.	2025-11-20 17:31:07 +01:00
Andras Bacsai	14bba8ba86	fix: correct Sentinel default health status and remove debug logging This commit addresses container status reporting issues and removes debug logging: Primary Fix: - Changed PushServerUpdateJob to default to 'unknown' instead of 'unhealthy' when health_status field is missing from Sentinel data - This ensures containers WITHOUT healthcheck defined are correctly reported as "unknown" not "unhealthy" - Matches SSH path behavior (GetContainersStatus) which already defaulted to 'unknown' Service Multi-Container Aggregation: - Implemented service container status aggregation (same pattern as applications) - Added serviceContainerStatuses collection to both Sentinel and SSH paths - Services now aggregate status using priority: unhealthy > unknown > healthy - Prevents race conditions where last-processed container would win Debug Logging Cleanup: - Removed all [STATUS-DEBUG] logging statements (25 total) - Removed all ray() debugging calls (3 total) - Removed proof_unknown_preserved and health_status_was_null debug fields - Code is now production-ready Test Coverage: - Added 2 new tests for Sentinel default health status behavior - Added 5 new tests for service aggregation in SSH path - All 16 tests pass (66 assertions) Note: The root cause was identified as Sentinel (Go binary) also defaulting to "unhealthy". That will need a separate fix in the Sentinel codebase. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 11:10:34 +01:00
Andras Bacsai	747a48b933	debug: add detailed Sentinel container processing logging Added comprehensive logging to track why applicationContainerStatuses collection is empty in PushServerUpdateJob. ## Logging Added ### 1. Raw Sentinel Data (line 113-118) Logs: Complete container data received from Sentinel Purpose: See exactly what Sentinel is sending Data: Container count and full container array with all labels ### 2. Container Processing Loop (line 157-163) Logs: Every container as it's being processed Purpose: Track which containers enter the processing loop Data: Container name, status, all labels, coolify.managed flag ### 3. Skipped Containers - Not Managed (line 165-171) Logs: Containers without coolify.managed label Purpose: Identify containers being filtered out early Data: Container name ### 4. Successful Container Addition (line 193-198) Logs: When container is successfully added to applicationContainerStatuses Purpose: Confirm containers ARE being processed Data: Application ID, container name, container status ### 5. Missing com.docker.compose.service Label (line 200-206) Logs: Containers skipped due to missing com.docker.compose.service Purpose: Identify the most likely root cause Data: Container name, application ID, all labels ## Why This Matters User reported applicationContainerStatuses is empty (`[]`) even though Sentinel is pushing updates. This logging will reveal: 1. Is Sentinel sending containers at all? 2. Are containers filtered by coolify.managed check? 3. Is com.docker.compose.service label missing? (most likely) 4. What labels IS Sentinel actually sending? ## Expected Findings Based on investigation, the issue is likely: - Sentinel is NOT sending com.docker.compose.service in labels - Or Sentinel uses a different label format/name - Containers pass all other checks but fail on line 190-206 ## Next Steps After logs appear, we'll see exactly which filter is blocking containers and can fix the root cause (likely need to extract com.docker.compose.service from container name or use a different label source). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-20 08:34:42 +01:00
Andras Bacsai	d2d9c1b2bc	debug: add comprehensive status change logging Added detailed debug logging to all status update paths to help diagnose why "unhealthy" status appears in the UI. ## Logging Added ### 1. PushServerUpdateJob (Sentinel updates) Location: Lines 303-315 Logs: Status changes from Sentinel push updates Data tracked: - Old vs new status - Container statuses that led to aggregation - Status flags (hasRunning, hasUnhealthy, hasUnknown) ### 2. GetContainersStatus (SSH updates) Location: Lines 441-449, 346-354, 358-365 Logs: Status changes from SSH-based checks Scenarios: - Normal status aggregation - Recently restarted containers (kept as degraded) - Applications not running (set to exited) Data tracked: - Old vs new status - Container statuses - Restart count and timing - Whether containers exist ### 3. Application Model Status Accessor Location: Lines 706-712, 726-732 Logs: When status is set without explicit health information Issue: Highlights cases where health defaults to "unhealthy" Data tracked: - Raw value passed to setter - Final result after default applied ## How to Use ### Enable Debug Logging Edit `.env` or `config/logging.php` to set log level to debug: ``` LOG_LEVEL=debug ``` ### Monitor Logs ```bash tail -f storage/logs/laravel.log | grep STATUS-DEBUG ``` ### Log Format All logs use `[STATUS-DEBUG]` prefix for easy filtering: ``` [2025-11-19 13:00:00] local.DEBUG: [STATUS-DEBUG] Sentinel status change { "source": "PushServerUpdateJob", "app_id": 123, "app_name": "my-app", "old_status": "running:unknown", "new_status": "running:healthy", "container_statuses": [...], "flags": {...} } ``` ## What to Look For 1. Default to unhealthy: Check Application model accessor logs 2. Status flipping: Compare timestamps between Sentinel and SSH updates 3. Incorrect aggregation: Check flags and container_statuses 4. Stale database values: Check if old_status persists across multiple logs ## Next Steps After gathering logs, we can: 1. Identify the exact source of "unhealthy" status 2. Determine if it's a default issue, aggregation bug, or timing problem 3. Apply targeted fix based on evidence 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 13:52:08 +01:00
Andras Bacsai	6b62847a11	fix: preserve unknown health status in Sentinel updates (PushServerUpdateJob) ## Problem Services with "running (unknown)" status were periodically changing to "running (healthy)" every ~30 seconds when Sentinel pushed updates. This was confusing for users and inconsistent with SSH-based status checks. ## Root Cause `PushServerUpdateJob::aggregateMultiContainerStatuses()` was missing logic to track "unknown" health state. It only tracked "unhealthy" and defaulted everything else to "healthy". When Sentinel pushed updates with "running (unknown)" containers: - The job saw `hasRunning = true` and `hasUnhealthy = false` - It incorrectly returned "running (healthy)" instead of "running (unknown)" ## Solution Updated `PushServerUpdateJob` to match the logic in `GetContainersStatus`: 1. Added `$hasUnknown` tracking variable 2. Check for "unknown" in status strings (alongside "unhealthy") 3. Implement 3-way priority: unhealthy > unknown > healthy This ensures consistency between: - SSH-based updates (`GetContainersStatus`) - Sentinel-based updates (`PushServerUpdateJob`) - UI display logic ## Changes - app/Jobs/PushServerUpdateJob.php: Added unknown status tracking - tests/Unit/PushServerUpdateJobStatusAggregationTest.php: New comprehensive tests - tests/Unit/ExcludeFromHealthCheckTest.php: Updated to match current implementation ## Testing All 31 status-related unit tests passing: - 18 tests in ContainerHealthStatusTest - 8 tests in ExcludeFromHealthCheckTest (updated) - 6 tests in PushServerUpdateJobStatusAggregationTest (new) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-19 13:40:58 +01:00
Andras Bacsai	08d257535a	fix(docker): enhance container status aggregation for multi-container applications, including exclusion handling based on docker-compose configuration	2025-09-13 20:32:15 +02:00
Andras Bacsai	0f5c988658	fix(horizon): add silenced jobs	2025-07-12 14:44:32 +02:00
Andras Bacsai	24688b2ad8	fix(jobs): update middleware to use expireAfter for WithoutOverlapping in multiple job classes	2025-07-01 10:50:27 +02:00
Andras Bacsai	f9a0ca2ca6	refactor(proxy): update StartProxy calls to use named parameter for async option	2025-06-16 13:13:01 +02:00
Andras Bacsai	ddcb14500d	refactor(proxy-status): refactored how the proxy status is handled on the UI and on the backend feat(cloudflare): improved cloudflare tunnel automated installation	2025-06-06 14:47:54 +02:00
Andras Bacsai	97ec579910	refactor(push-server-update): enhance application preview handling by incorporating pull request IDs and adding status update protections	2025-06-04 10:03:36 +02:00
Andras Bacsai	9883cef26d	refactor(jobs): update middleware to include job-specific identifiers for WithoutOverlapping	2025-05-29 17:31:43 +02:00
Andras Bacsai	0369909408	fix(PushServerUpdateJob): add null checks before updating application and database statuses	2025-05-29 10:47:26 +02:00
Andras Bacsai	c6278a06ba	refactor(jobs): unify middleware configuration to prevent job release after expiration for DockerCleanupJob and PushServerUpdateJob	2025-05-07 14:42:42 +02:00
Andras Bacsai	b78f2cccff	refactor(jobs): update WithoutOverlapping middleware to use expireAfter for better queue management	2025-04-18 09:52:32 +02:00
Andras Bacsai	b09f0043d1	fix: restrict jobs on cloud fix: restrict sentinel endpoint	2025-01-10 11:54:45 +01:00
Andras Bacsai	7dc65dfd79	fix: make sure important jobs/actions are running on high prio queue	2024-11-22 11:16:01 +01:00
Andras Bacsai	275edb6c1f	put a few things on high queue	2024-11-06 12:33:56 +01:00
Lucas Michot	8e1444eaa7	Get rid of many useless blank lines	2024-10-31 17:44:01 +01:00
Andras Bacsai	96ca72fcdb	refactor server view (phuuu)	2024-10-30 20:03:30 +01:00
Lucas Michot	5b6e466e0c	Remove some useless catch blocks	2024-10-28 14:37:00 +01:00
Lucas Michot	d557a22b91	Remove all ray() calls	2024-10-28 13:51:23 +01:00
Andras Bacsai	8c96ab52d7	feat: notification rate limiter fix: limit server up / down notification limits	2024-10-25 15:13:23 +02:00
Andras Bacsai	621e063bf1	Refactor PushServerUpdateJob to implement ShouldBeEncrypted interface	2024-10-24 15:16:00 +02:00
Andras Bacsai	ac768e5313	feat: limit storage check emails feat: sentinel should send storage usage	2024-10-22 14:01:36 +02:00
Andras Bacsai	537630acc6	Refactor PushServerUpdateJob to handle container restart notifications	2024-10-22 11:42:24 +02:00
Andras Bacsai	d7efe8a6d1	fix: no sentinel for swarm yet	2024-10-22 11:29:43 +02:00
Andras Bacsai	4c95647b96	feat: cleanup sentinel on server deletion fix: Sentinel should not be enabled on build servers	2024-10-17 11:21:43 +02:00
Andras Bacsai	2702fbc284	Refactor logging in PushServerUpdateJob, Application, and SentinelSeeder	2024-10-15 17:03:50 +02:00
Andras Bacsai	d446cd4f31	sentinel updates	2024-10-15 13:39:19 +02:00
Andras Bacsai	81db57002b	Refactor PushServerUpdateJob to handle multiple servers, previews, and emails	2024-10-14 22:53:16 +02:00
Andras Bacsai	fdeb9353be	chore: Update project service configuration view	2024-10-14 19:45:03 +02:00
Andras Bacsai	1f72321681	fix: sentinel	2024-10-14 18:04:36 +02:00
Andras Bacsai	8a2c9f3d44	updates sentinel	2024-10-14 17:54:29 +02:00
Andras Bacsai	b2e515f770	sentinel	2024-10-14 13:32:36 +02:00
Andras Bacsai	1f193d465d	sentinel updates	2024-10-14 12:07:37 +02:00
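
Several commits in this log (6b62847a11 and 14bba8ba86 in particular) describe a three-way health priority for multi-container resources: unhealthy > unknown > healthy, with a missing `health_status` treated as "unknown" rather than "unhealthy". The following is a minimal sketch of that rule in Python, for illustration only. The project itself is PHP, and the function names, the dictionary shape, and the handling of the non-running case are assumptions, not the actual `PushServerUpdateJob` code.

```python
from typing import Iterable, Mapping, Optional


def health_of(container: Mapping[str, str]) -> str:
    """Return the container's health, defaulting to 'unknown' when the
    health_status field is absent (hypothetical data shape)."""
    return container.get("health_status") or "unknown"


def aggregate_running_health(statuses: Iterable[str]) -> Optional[str]:
    """Aggregate strings such as 'running (healthy)' into a single status.

    Priority for running containers: unhealthy > unknown > healthy.
    Returns None when no container is running; that case is outside
    the scope of this sketch.
    """
    has_running = has_unhealthy = has_unknown = False
    for status in statuses:
        if "running" not in status:
            continue
        has_running = True
        # Test 'unhealthy' before anything else: it contains 'healthy'.
        if "unhealthy" in status:
            has_unhealthy = True
        elif "unknown" in status:
            has_unknown = True

    if not has_running:
        return None
    if has_unhealthy:
        return "running (unhealthy)"
    if has_unknown:
        return "running (unknown)"
    return "running (healthy)"
```

Because the result depends on flags collected over every container rather than on whichever container was processed last, the order of the input does not matter, which is the race the commit messages say the aggregation is meant to avoid.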

42 Commits
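
Commit 6d47d24169 mentions an `isActiveOrTransient()` helper that separates active or transient container states (running, restarting, starting, created, paused) from terminal ones (exited, dead). A rough Python sketch of such a check is shown below. The state lists come from the commit message, while the function name, the status-string parsing, and the constant names are assumptions made for illustration and are not taken from the PHP source.

```python
# States named in the commit message as active or transient.
ACTIVE_OR_TRANSIENT_STATES = frozenset(
    {"running", "restarting", "starting", "created", "paused"}
)
# States named in the commit message as terminal.
TERMINAL_STATES = frozenset({"exited", "dead"})


def base_state(status: str) -> str:
    """Strip a health suffix such as ':healthy' or ' (unknown)' (assumed formats)."""
    return status.split(":", 1)[0].split("(", 1)[0].strip().lower()


def is_active_or_transient(status: str) -> bool:
    """True when the container should still be tracked instead of marked exited."""
    return base_state(status) in ACTIVE_OR_TRANSIENT_STATES
```

With a check like this, a database whose container reports `restarting` stays in the tracked set, so its status is not briefly overwritten with `exited` between polls.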