PagerDuty has indicated a potential incident with our Furnishment API -- Bloom is actively investigating
Incident Report for bloomcredit
Resolved
This incident has been resolved.
Posted Mar 29, 2024 - 18:54 UTC
Investigating
Affected Services/API Endpoints:

Furnishment API

Description:

We are observing degraded performance in our Furnishment Product Catalog API. Response times for API requests have increased significantly, delaying retrieval of product information for both internal systems and external integrations. Customers relying on our API services may experience slower responses and reduced efficiency.

Current Status:

The incident is under active investigation. Our engineering team is analyzing system metrics and logs to identify the root cause of the performance degradation. Initial investigations suggest a potential bottleneck in the database query processing layer.

Root Cause Analysis (if known):

Preliminary analysis indicates that the performance degradation may be attributed to increased database query load, potentially caused by a surge in API requests or inefficient query execution.

Actions Taken:

- Engaged the engineering team to investigate and troubleshoot the performance issues.
- Implemented temporary optimizations to alleviate the load on the affected database server.
- Monitored system metrics and performance indicators to track the effectiveness of implemented optimizations.
- Notified relevant internal teams, including product management and customer support, about the ongoing incident.

Next Steps:

- Continue investigating the root cause of the performance degradation, focusing on database query optimization and scalability improvements.
- Implement long-term solutions to enhance the resilience and scalability of the Product Catalog API.
- Communicate updates internally and externally as the investigation progresses and performance improvements are achieved.

Estimated Time to Resolution (ETR):

The estimated time to resolution is currently unknown. Updates will be provided as more information becomes available and improvements are implemented.

Communication Plan:

Internal communication: Regular updates will be provided to relevant teams via email, Slack channels, and virtual meetings.
External communication: The public status page will be updated with information about the incident and expected resolution times. Additionally, email notifications and Slack messages will be sent to impacted customers and partners.

Internal Notes:

Additional resources and support have been allocated to the engineering team to expedite the resolution process and minimize the impact on customers.


Follow-up:

A post-incident analysis will be conducted to identify contributing factors and implement preventive measures against similar performance issues in the future.

Incident Owner: Furnishment Team

Key Stakeholders: Product Management, Customer Support
Posted Mar 29, 2024 - 18:37 UTC
This incident affected: Furnishment API (Submission Engine, Data Management).