Deepseek Performance Monitoring
What is the project about?
The project is a performance monitoring solution specifically designed for the Deepseek API. It measures key performance indicators to assess the API's responsiveness and throughput.
What problem does it solve?
The project helps users understand and track the performance of the Deepseek API over time. It allows for identifying potential bottlenecks, performance degradation, or issues with the API service. It provides quantifiable data for monitoring service level agreements (SLAs).
What are the features of the project?
- Latency Measurement: Measures the time it takes to receive the first token (first_token_latency_ms) and the total response time (total_latency_ms) in milliseconds.
- Throughput Calculation: Calculates the number of tokens processed per second (tokens_per_second).
- Configurable Monitoring: Allows users to set the monitoring interval and the total duration of the test.
- Comprehensive Logging: Saves detailed results to a CSV file, including timestamps, latency, tokens per second, and token counts.
- Multi-Model Support: Supports performance testing for multiple Deepseek models, including
deepseek-chat
anddeepseek-reasoner
. - Random Prompt Generation: Uses a variety of prompts to simulate real-world usage and ensure diverse testing scenarios.
- Error Handling: Logs errors to both the console and the CSV file, allowing the script to continue execution.
- Statistical Analysis: Provides basic performance statistics, including average, maximum, and minimum values for TPS and latency.
- Command-line Interface: Offers command-line options for easy configuration.
What are the technologies used in the project?
- Python 3: The primary programming language.
- LiteLLM: A library used to interact with the Deepseek API.
- Deepseek API: The API being monitored.
What are the benefits of the project?
- Performance Insight: Provides clear metrics to understand Deepseek API performance.
- Proactive Monitoring: Enables early detection of performance issues.
- Data-Driven Decisions: Facilitates informed decisions about API usage and scaling.
- Easy to Use: Simple setup and execution with command-line options.
- Customizable: Adaptable to different testing needs through configuration.
What are the use cases of the project?
- API Performance Benchmarking: Establish baseline performance metrics for the Deepseek API.
- Continuous Monitoring: Regularly track API performance to identify trends and anomalies.
- Load Testing: Simulate different levels of usage to assess API stability under stress.
- Service Level Agreement (SLA) Monitoring: Ensure the Deepseek API meets performance requirements.
- Troubleshooting: Diagnose performance issues and identify potential causes.
- Capacity Planning: Determine the capacity and scalability of the Deepseek API for future needs.
