CodeJudge: Algorithmic programming challenges
Akash Girme
Source Code on GitHub
Visit Project
Application System Design
Let's understand the flow of the application.
1. Client
- Role: User Interface
- Description: The client is the front-end interface where users can write and submit code for a specific problem.
- Flow:
- Code Submission: The user writes code and submits it through the front end.
- API Interaction: The client sends the code and problem details as a request to the API.
- Polling: After submission, the client continuously polls the API server (e.g., every second) to check the submission status until a result is received.
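To make the submit-and-poll flow concrete, here is a minimal client-side sketch in TypeScript. The endpoint paths (`/api/submissions`, `/api/submissions/:requestId/status`) and the response shapes are assumptions for illustration; the real API routes may differ.

```ts
// Sketch of the client-side submit-and-poll flow.
// Endpoint paths and response shapes below are illustrative assumptions.

type SubmissionStatus = "pending" | "running" | "completed" | "failed";

interface StatusResponse {
  status: SubmissionStatus;
  result?: { output: string; timeMs: number; memoryKb: number };
}

async function submitAndPoll(code: string, language: string, problemId: string) {
  // 1. Submit the code; the API server responds immediately with a request ID.
  const submitRes = await fetch("/api/submissions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ code, language, problemId }),
  });
  const { requestId } = (await submitRes.json()) as { requestId: string };

  // 2. Short-poll every second until the submission reaches a terminal state.
  while (true) {
    const statusRes = await fetch(`/api/submissions/${requestId}/status`);
    const data = (await statusRes.json()) as StatusResponse;

    if (data.status === "completed" || data.status === "failed") {
      return data; // show verdict, output, time/memory usage, etc.
    }
    await new Promise((resolve) => setTimeout(resolve, 1000)); // ~1s poll interval
  }
}
```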
2. API Server
- Role: Backend service for handling submissions and problems.
- Description: The API handles incoming requests from the client, manages submission records, and coordinates with the Code Execution Worker.
- Flow:
- Request Reception: The API receives the code submission request from the client.
- Acknowledgment: Instead of immediately creating a submission entry in the primary SQL database, the server stores the submission payload in Redis, queues the task in the message queue, generates a unique request ID (UUID), and sends it back to the client in the response.
- Message Queue Events: The API server continuously listens to message queue events (completed, failed, pending, running, etc.). When a job completes successfully, the server fetches its result from Redis using the request ID, since the worker stores the execution result in Redis once it finishes.
- Response to Client: The API responds to the client with the request ID and continues to handle the client's polling requests. When the client asks for the status, the server looks up the submission result in Redis by request ID and responds accordingly.
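A condensed sketch of this acknowledge-and-enqueue path, assuming an Express server with BullMQ and ioredis. The queue name `submissions`, the `payload:`/`result:` key prefixes, and the route paths are illustrative assumptions rather than the project's actual identifiers.

```ts
import express from "express";
import { Queue, QueueEvents } from "bullmq";
import { Redis } from "ioredis";
import { randomUUID } from "node:crypto";

const connection = new Redis({ maxRetriesPerRequest: null });
const submissionQueue = new Queue("submissions", { connection });
const app = express();
app.use(express.json());

// Acknowledge: stash the payload in Redis, enqueue the job, return a request ID.
app.post("/api/submissions", async (req, res) => {
  const requestId = randomUUID();
  const payload = { requestId, ...req.body }; // code, language, problemId, ...

  await connection.set(`payload:${requestId}`, JSON.stringify(payload), "EX", 3600);
  await submissionQueue.add("execute", { requestId }); // worker loads payload from Redis

  res.status(202).json({ requestId });
});

// Polling endpoint: look up the result the worker wrote to Redis under the request ID.
app.get("/api/submissions/:requestId/status", async (req, res) => {
  const result = await connection.get(`result:${req.params.requestId}`);
  res.json(result ? { status: "completed", result: JSON.parse(result) } : { status: "pending" });
});

// Listen to queue events so the server can react when a job finishes or fails.
const events = new QueueEvents("submissions", { connection });
events.on("completed", ({ jobId }) => console.log(`job ${jobId} completed`));
events.on("failed", ({ jobId, failedReason }) => console.log(`job ${jobId} failed: ${failedReason}`));

app.listen(3000);
```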
3. Worker
- Role: Job/Code Execution
- Description: The worker is responsible for consuming tasks from the task queue, executing the code in the corresponding language environment (e.g., C, C++, Go), and storing the result in Redis.
- Flow:
- Task Consumption: The worker consumes a job from the task queue, which contains the code, language and test cases.
- Code Execution: The worker executes the code in a secure, isolated environment (a containerized environment with Linux kernel-level sandboxing via isolate) and runs it against the test cases.
- Result Posting: After execution, the worker stores the result (e.g., output, errors, time/memory usage) in Redis with the request ID as the key, for further processing by the API server.
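A rough sketch of the worker side, under the same assumptions as the API server sketch above; `runInSandbox` is a hypothetical stand-in for the actual isolate-based execution step.

```ts
import { Worker } from "bullmq";
import { Redis } from "ioredis";

const connection = new Redis({ maxRetriesPerRequest: null });

// Placeholder for the real sandboxed execution (isolate + per-language toolchain).
async function runInSandbox(payload: { code: string; language: string; testCases: unknown[] }) {
  // ... compile/run the code against each test case inside an isolate box ...
  return { verdict: "Accepted", timeMs: 42, memoryKb: 10_240 };
}

const worker = new Worker(
  "submissions",
  async (job) => {
    const { requestId } = job.data as { requestId: string };

    // 1. Fetch the full submission payload the API server stored in Redis.
    const raw = await connection.get(`payload:${requestId}`);
    if (!raw) throw new Error(`payload for ${requestId} not found`);

    // 2. Execute the code in the isolated environment against the test cases.
    const result = await runInSandbox(JSON.parse(raw));

    // 3. Store the result under the request ID so the API server can serve polls.
    await connection.set(`result:${requestId}`, JSON.stringify(result), "EX", 3600);
    return result;
  },
  { connection }
);
```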
4. Message/Job Queue
- Role: Message Queue
- Description: The message queue, implemented with BullMQ, coordinates work between the API server and the workers (multiple workers may be running at once).
Flow Diagram
Sequence Diagram
Techniques & Methods for an Optimal Workflow
- Database Optimization: By offloading large objects (submitted code and error output such as compile or runtime errors) to an object store, the system reduces database load and improves scalability, making the database primarily responsible for metadata (e.g., submission entries, problems, and test cases).
- Asynchronous Processing: The system uses task queues to handle submissions asynchronously. This ensures that the client isn't blocked while waiting for code execution, and workers can process jobs at their own pace.
- Isolated Execution: Workers execute code in isolated environments (using the isolate tool, which takes advantage of Linux kernel features such as namespaces and control groups) to ensure security and control over resources like CPU and memory. This isolation prevents one job from affecting other jobs or the system; a sketch of the isolate invocation follows this list.
- Polling Mechanism: The client repeatedly polls (short polling) the API server to check the submission status. This keeps the client responsive while waiting for the result.
- Queue Management: Redis (via BullMQ) manages task distribution to workers and collects results, allowing for scalable and distributed job processing.
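As an illustration of the isolated-execution step, the sketch below drives isolate from Node. The box id, resource limits, and the way the binary and input are supplied are assumptions for illustration; the project's actual invocation may differ.

```ts
import { execFileSync } from "node:child_process";

// Sketch of driving isolate (Linux namespaces + cgroups) from a worker process.
// Box id, limits, and paths below are illustrative assumptions.
function executeInIsolate(binaryPath: string, stdinData: string): string {
  const boxId = "0";

  // Create the sandbox directory for this box id.
  execFileSync("isolate", [`--box-id=${boxId}`, "--init"]);

  try {
    // Run the program under CPU-time, wall-time, memory, and process limits.
    return execFileSync(
      "isolate",
      [
        `--box-id=${boxId}`,
        "--time=2",        // CPU time limit in seconds
        "--wall-time=5",   // wall-clock limit in seconds
        "--mem=262144",    // memory limit in KB
        "--processes=1",   // disallow spawning extra processes
        "--run", "--", binaryPath,
      ],
      { input: stdinData, encoding: "utf8" }
    );
  } finally {
    // Always tear the box down, even if the run failed or timed out.
    execFileSync("isolate", [`--box-id=${boxId}`, "--cleanup"]);
  }
}
```

Keeping `--init`, `--run`, and `--cleanup` as separate steps makes the sandbox lifecycle explicit and ensures the box is torn down even when a run fails.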
Database schema
The database schema consists of six tables, each designed to capture a different aspect of the application's functionality. Here's a detailed overview of the relationships among these tables.
Database Tables and Relationships
- User Table: Represents the users of the application. A user can be associated with multiple submissions and solutions.
- Problem Table: Contains details about the problems created within the application. Each problem is authored by a single user, establishing a one-to-many relationship between the User and Problem tables (one user can author multiple problems, but each problem has only one author).
- Tag Table: Holds tags that can be associated with problems. The relationship between the Problem and Tag tables is many-to-many: one problem can have multiple tags, and one tag can be associated with multiple problems.
- Test Cases Table: Stores test cases related to each problem. There is a one-to-many relationship between the Problem and Test Cases tables (one problem can have multiple test cases, but each test case is linked to only one problem).
- Submissions Table: Records submissions made by users for specific problems. This table has a many-to-one relationship with the Problem table (one problem can have many submissions, but each submission is tied to only one problem) and a many-to-one relationship with the User table (one user can make multiple submissions, but each submission belongs to only one user).
- Solutions Table: Contains solutions provided by users for specific problems. There is a many-to-one relationship with the Problem table (one problem can have multiple solutions, but each solution is associated with only one problem) and, similarly, a many-to-one relationship with the User table (one user can submit multiple solutions, but each solution belongs to only one user).
Summary of Relationships
- User ↔ Problem: One-to-Many
- Problem ↔ Tag: Many-to-Many
- Problem ↔ Test Cases: One-to-Many
- Problem ↔ Submissions: One-to-Many
- User ↔ Submissions: One-to-Many
- Problem ↔ Solutions: One-to-Many
- User ↔ Solutions: One-to-Many
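As a rough sketch of how a few of these relationships could be modelled in code, assuming a TypeORM-style setup (the actual ORM, entity names, and columns are assumptions for illustration):

```ts
import "reflect-metadata";
import {
  Entity, PrimaryGeneratedColumn, Column,
  ManyToOne, OneToMany, ManyToMany, JoinTable,
} from "typeorm";

@Entity()
export class User {
  @PrimaryGeneratedColumn()
  id!: number;

  @Column()
  username!: string;

  // One user can author many problems and make many submissions.
  @OneToMany(() => Problem, (p) => p.author)
  problems!: Problem[];

  @OneToMany(() => Submission, (s) => s.user)
  submissions!: Submission[];
}

@Entity()
export class Problem {
  @PrimaryGeneratedColumn()
  id!: number;

  @Column()
  title!: string;

  // Each problem has exactly one author (User ↔ Problem: one-to-many).
  @ManyToOne(() => User, (u) => u.problems)
  author!: User;

  // Problem ↔ Tag is many-to-many via a join table.
  @ManyToMany(() => Tag, (t) => t.problems)
  @JoinTable()
  tags!: Tag[];

  @OneToMany(() => Submission, (s) => s.problem)
  submissions!: Submission[];
}

@Entity()
export class Tag {
  @PrimaryGeneratedColumn()
  id!: number;

  @Column()
  name!: string;

  @ManyToMany(() => Problem, (p) => p.tags)
  problems!: Problem[];
}

@Entity()
export class Submission {
  @PrimaryGeneratedColumn()
  id!: number;

  @Column()
  verdict!: string;

  // Each submission is tied to exactly one problem and one user.
  @ManyToOne(() => Problem, (p) => p.submissions)
  problem!: Problem;

  @ManyToOne(() => User, (u) => u.submissions)
  user!: User;
}
```

The Test Cases and Solutions tables follow the same many-to-one pattern shown for Submission.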
ER Diagram
Deployment
The application is deployed on DigitalOcean using Kubernetes, which facilitates the seamless creation of multiple replicas or instances of worker nodes. This capability allows for effective load distribution across several workers, particularly during periods of high demand.
Kubernetes Cluster Configuration
The Kubernetes cluster comprises several key components:
- Client Pod: Handles client requests.
- API Server Pod: Manages API interactions and requests.
- NGINX Ingress Controller Pod: Routes external traffic to the appropriate services within the cluster.
- Worker Pod: Executes the main workload. Additional worker pods can be created dynamically to manage increased workloads efficiently.
Email and Object Storage Services
For email services, the application utilizes AWS Simple Email Service (SES), while AWS S3 is employed for object storage. These services provide reliable and scalable solutions for handling email communications and data storage.
Enhanced Security Measures
To maximize security, Cloudflare is placed in front of the application as a reverse proxy. This setup helps protect the application from various threats and enhances performance by caching content and reducing latency.
This architecture effectively leverages Kubernetes' capabilities for scalability and resilience while integrating robust cloud services for email and storage, along with security enhancements through Cloudflare.