Frequently Asked Questions
Q1: What is AICodingGym?
A: AICodingGym is a free challenge platform designed to assess and train software engineers' AI-assisted programming skills. Built by researchers at UC San Diego, it provides a gym environment where users tackle coding challenges that today's AI tools still cannot solve on their own. The goal is to help developers use AI coding tools more effectively while benchmarking the performance of those tools.
Q2: What types of challenges are available?
A: We offer four challenge types: LeetCode-style algorithm problems, Human-SWE-Bench (real bug fixes from open-source GitHub projects), AI/ML competitions (Kaggle-style data science tasks), and Code Review challenges (reviewing real pull request diffs). Each type tests a different aspect of AI-assisted programming.
Q3: Can I use AI tools to solve challenges?
A: Yes! AICodingGym is specifically designed to assess and train your AI-assisted programming skills. You are encouraged to use AI coding tools like Copilot, ChatGPT, Claude, or any other assistant. The challenges are curated to be at the edge of what AI can solve alone, so your skill in collaborating with AI is what matters.
Q4: How do I sign up and track my progress?
A: Click the "Login with GitHub" button in the top-right corner to authenticate with your GitHub account. Once logged in, your progress is automatically tracked as you attempt and solve challenges. You can view your solved challenges and stats on your Profile page.
Q5: How are submissions evaluated?
A: For LeetCode challenges, your solution is evaluated against a set of test cases. SWE-Bench challenges run unit tests against your patch to verify correctness. AI/ML challenges are scored based on your model's performance on a held-out test set, and your percentile rank on the competition leaderboard determines your score. Code Review challenges check whether you correctly identified the issues in the code.
Q6: How are leaderboard scores calculated?
A: Each solved challenge earns points based on its difficulty: Easy = 1 point, Medium = 5 points, Hard = 10 points, and Expert = 20 points. For AI/ML (MLE-bench) challenges, a multiplier is applied based on your leaderboard percentile: top 10% earns the full score, top 30% earns 40% of the base score, and top 60% earns 20% of the base score. Your total score is the sum of the points earned across all your solved challenges.
Q7: How do AI/ML challenge percentiles work?
A: For AI/ML (MLE-bench) challenges, after your submission is evaluated, you receive a percentile ranking based on how your solution compares to historical competition entries. A lower percentile is better (top 10% means you outperformed 90% of entries). Your score multiplier depends on which tier you land in: top 10% gets full points, top 30% gets 40%, and top 60% gets 20%.
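Taken together, the scoring rules described in Q6 and Q7 can be sketched as follows. This is an illustrative sketch only: the function and constant names (`BASE_POINTS`, `mle_multiplier`, `challenge_score`) are hypothetical, not the platform's actual implementation, and it assumes a percentile outside the top 60% earns zero points.

```python
from typing import Optional

# Base points per difficulty, as described in Q6.
BASE_POINTS = {"easy": 1, "medium": 5, "hard": 10, "expert": 20}

def mle_multiplier(percentile: float) -> float:
    """Map an MLE-bench percentile (lower is better) to a score multiplier.

    Tiers from Q7: top 10% -> full points, top 30% -> 40%, top 60% -> 20%.
    Below the top 60% is assumed to earn nothing.
    """
    if percentile <= 10:
        return 1.0
    if percentile <= 30:
        return 0.4
    if percentile <= 60:
        return 0.2
    return 0.0

def challenge_score(difficulty: str, percentile: Optional[float] = None) -> float:
    """Points for one solved challenge.

    The percentile multiplier applies only to AI/ML (MLE-bench) challenges;
    pass percentile=None for the other challenge types.
    """
    base = BASE_POINTS[difficulty]
    if percentile is None:
        return float(base)
    return base * mle_multiplier(percentile)

# Example: one solved Hard challenge plus an Expert AI/ML challenge
# that landed in the top 30% (25th percentile): 10 + 20 * 0.4 = 18.0
total = challenge_score("hard") + challenge_score("expert", percentile=25)
```

Under these assumptions, your leaderboard total would simply be the sum of `challenge_score` over every challenge you have solved.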
Q8: How can I contribute or report issues?
A: You can contribute interview problems through the Community page. For bugs, feature requests, or general feedback, use our feedback form below or reach out via our GitHub organization or Discord server (links in the footer).
Didn't find what you're looking for? Have a bug report or suggestion?
Send Us Feedback