feat(telemetry): integrate OpenTelemetry observability stack with health metrics

- Add OpenTelemetry SDK, OTLP exporter, Prometheus integration
- Implement connection tracking with active/total/disconnection metrics
- Add health endpoint with uptime and connection counts
- Integrate tracing spans for socket events and engine messages
- Add metrics collection for event handling duration
- Update health endpoint to include live runtime state
- Add graceful telemetry shutdown in main function
- Implement engine session active metrics tracking
- Add namespace-specific attributes to connection metrics
- Introduce message edit history retrieval endpoint
- Add scheduled message CRUD operations and dispatcher
- Update Socket.IO event registration with observability
- Refactor component update and drop its dead-code allowance
- Add comprehensive environment variables documentation
- Implement detailed development guidelines in AGENTS.md
zhenyi
2026-06-11 13:53:29 +08:00
parent 40241e5db3
commit 0dbac480ae
22 changed files with 3116 additions and 64 deletions
+11 -1
@@ -129,6 +129,12 @@ impl SessionStore {
 sid
 );
 }
+if let Some(m) = crate::telemetry::metrics::try_get() {
+    m.engine_sessions_active.add(
+        1,
+        &[opentelemetry::KeyValue::new("transport", transport.as_str())],
+    );
+}
 rx
 }
@@ -137,7 +143,11 @@ impl SessionStore {
 }
 pub fn remove(&self, sid: &str) {
-self.sessions.remove(sid);
+if self.sessions.remove(sid).is_some()
+    && let Some(m) = crate::telemetry::metrics::try_get()
+{
+    m.engine_sessions_active.add(-1, &[]);
+}
 }
 pub fn exists(&self, sid: &str) -> bool {
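
Review note: the increment in the first hunk is recorded with a `transport` attribute, while the decrement in `remove` is recorded with an empty attribute set. In OpenTelemetry each distinct attribute set is aggregated as its own series, so the per-transport series would likely never go back down. The sketch below illustrates one way to keep the two sides balanced, using only the standard library. `UpDownCounter` here is a hypothetical stand-in for the real instrument, and storing the transport next to the session is an assumption about what `SessionStore` could do, not what it currently does.

```rust
use std::collections::BTreeMap;

type Attrs = Vec<(String, String)>;

/// Normalise an attribute slice into a sorted, owned key.
fn key(attrs: &[(&str, &str)]) -> Attrs {
    let mut k: Attrs = attrs
        .iter()
        .map(|(name, value)| (name.to_string(), value.to_string()))
        .collect();
    k.sort();
    k
}

/// Hypothetical stand-in for an up-down counter:
/// one running sum per distinct attribute set.
#[derive(Default)]
struct UpDownCounter {
    series: BTreeMap<Attrs, i64>,
}

impl UpDownCounter {
    fn add(&mut self, delta: i64, attrs: &[(&str, &str)]) {
        *self.series.entry(key(attrs)).or_insert(0) += delta;
    }

    fn value(&self, attrs: &[(&str, &str)]) -> i64 {
        self.series.get(&key(attrs)).copied().unwrap_or(0)
    }
}

/// Hypothetical session store that remembers each session's transport
/// so the decrement can reuse the attributes of the increment.
#[derive(Default)]
struct SessionStore {
    sessions: BTreeMap<String, String>, // sid -> transport
    active: UpDownCounter,
}

impl SessionStore {
    fn new() -> Self {
        Self::default()
    }

    fn create(&mut self, sid: &str, transport: &str) {
        // Count the session only if the sid was not already present.
        if self
            .sessions
            .insert(sid.to_string(), transport.to_string())
            .is_none()
        {
            self.active.add(1, &[("transport", transport)]);
        }
    }

    fn remove(&mut self, sid: &str) {
        // Decrement the same series that was incremented on create.
        if let Some(transport) = self.sessions.remove(sid) {
            self.active.add(-1, &[("transport", transport.as_str())]);
        }
    }
}
```

With this shape, removing a session twice is a no-op the second time, and the series with no attributes is never touched. If the real store cannot hold the transport, dropping the attribute from the increment as well would be the other consistent option.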