82 / 100

Repository Evaluation

hmmhmmhm/daiso-mcp

The `CLAUDE.md` and `AGENTS.md` files provide a well-structured set of guidelines that improve developer/agent alignment. Their greatest strengths are clear, testable constraints on file size and test coverage, and an explicit focus on security and architectural patterns.

v2 2 files evaluated 4 days ago

Files Considered

agents AGENTS.md 17,534 bytes

claude CLAUDE.md 17,534 bytes

linked docs/ai-instruction.md 13,352 bytes

linked docs/cgv-network-analysis-result.md 7,204 bytes

linked docs/cu-app-request-capture-guide.md 3,197 bytes

linked docs/cu-app-scraping-replay-guide.md 3,927 bytes

linked docs/cu-network-analysis-result.md 3,717 bytes

linked docs/daiso-network-analysis-result.md 12,072 bytes

linked docs/daiso-playwright-network-analysis.md 5,885 bytes

linked docs/emart24-app-scraping-preparation-guide.md 4,560 bytes

linked docs/emart24-app-scraping-replay-guide.md 4,253 bytes

linked docs/emart24-network-analysis-result.md 5,607 bytes

linked docs/gs25-final-replay-methodology.md 9,970 bytes

linked docs/gs25-frida-plaintext-capture-guide.md 4,526 bytes

linked docs/lottecinema-network-analysis-result.md 13,766 bytes

linked docs/lottemart-mobile-scraping-replay-plan.md 10,202 bytes

linked docs/megabox-network-analysis-result.md 5,338 bytes

linked docs/mitmproxy-guide.md 4,568 bytes

linked docs/oliveyoung-lightpanda-validation.md 2,264 bytes

linked docs/oliveyoung-network-analysis-result.md 9,001 bytes

linked docs/oliveyoung-playwright-mcp-onboarding.md 3,252 bytes

linked README.md 17,407 bytes

Full Evaluation


## Summary

The `CLAUDE.md` and `AGENTS.md` files provide a well-structured set of guidelines that improve developer/agent alignment. Their greatest strengths are clear, testable constraints on file size and test coverage, and an explicit focus on security and architectural patterns.

## Scores

| Dimension | Score | Justification |
|-----------|-------|---------------|
| Specificity & Procedures | 9 | Provides precise steps for adding services: "1. Create the service directory 2. Implement ServiceProvider 3. Register it in the registry" (translated from Korean). |
| Measurable Constraints | 9 | Includes specific numeric thresholds: "all code files should be around 450 lines" and "`npm run test:coverage` satisfies 100%" (translated from Korean). |
| Commands & Tools | 7 | Defines tools like `wrangler` and specifies naming conventions, though it could be more restrictive about shell usage. |
| Structure & Layering | 8 | Logical grouping of policies from coding style to security and architecture. Easy to navigate. |
| Boundaries & Escalation | 6 | Good on "don'ts" regarding secrets, but lacks explicit instructions on when/how to ask the human for help. |
| Context & Motivation | 7 | Good context on the plugin-based architecture. Reasons are clear (e.g., "extensible plugin architecture", translated from Korean). |
| Anti-Rationalization | 4 | Lacks explicit naming of model "shortcuts" or "hallucination patterns" (e.g., "Don't assume code works just because it compiles"). |
| Signal-to-Noise | 9 | Almost zero fluff; nearly every section provides actionable instructions or essential project constraints. |

**Overall effectiveness**: 82 / 100
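
The "around 450 lines" threshold cited under Measurable Constraints is the kind of rule that a script can check mechanically. The following is a minimal sketch, not something taken from the repository: the function names, the in-memory `files` map, and the `tolerance` parameter (a stand-in for the vagueness of "around") are all assumptions made for illustration.

```typescript
// Hypothetical helper: names and the tolerance value are illustrative assumptions.
interface OversizedFile {
  path: string;
  lines: number;
}

// Count lines the way most editors do: a trailing newline does not add a line.
function countLines(content: string): number {
  if (content.length === 0) return 0;
  const parts = content.split(/\r?\n/);
  return content.endsWith("\n") ? parts.length - 1 : parts.length;
}

// Return every file whose line count exceeds limit + tolerance, largest first.
function findOversizedFiles(
  files: Record<string, string>, // path -> file content
  limit = 450,
  tolerance = 0,
): OversizedFile[] {
  return Object.keys(files)
    .map((path) => ({ path, lines: countLines(files[path]) }))
    .filter((file) => file.lines > limit + tolerance)
    .sort((a, b) => b.lines - a.lines);
}
```

A caller would read the source files from disk, pass their contents in, and fail the build when the returned list is non-empty. Keeping the function pure makes the threshold logic easy to test separately from file I/O.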

## Issues Found

### Critical
- None.

### Major
- **Lack of Anti-Rationalization**: The files assume the AI will follow written instructions perfectly. They do not warn the model about common AI pitfalls (e.g., "Don't hallucinate non-existent API endpoints," "Don't infer test success from syntax correctness").
- **Missing Escalation Protocol**: There is no defined process for when the model hits a blocker, encounters an ambiguous requirement, or suspects a security vulnerability.

### Minor
- **Redundancy**: The repository contains both `AGENTS.md` and `CLAUDE.md` with identical content, which creates a maintenance burden. Keep one as the single source of truth (typically `CLAUDE.md` when the tooling is Claude-specific).
- **Formatting rules stated only in prose**: Rules such as "2 spaces" are clear, but pairing them with an `.editorconfig` or Prettier configuration would let tooling enforce them instead of relying on the agent to comply.
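
If both instruction files have to remain for now (an assumption; the evaluation recommends keeping only one), a small drift check can at least catch the moment they stop being identical. This is a rough sketch with hypothetical names, operating on file contents that the caller has already read.

```typescript
// Hypothetical helper: reports the first line at which two instruction files differ.
interface DriftReport {
  inSync: boolean;
  firstDifferingLine: number | null; // 1-based; null when the files match
}

function compareInstructionFiles(agentsMd: string, claudeMd: string): DriftReport {
  // Splitting on \r?\n means CRLF and LF line endings are treated as equal.
  const a = agentsMd.split(/\r?\n/);
  const b = claudeMd.split(/\r?\n/);
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    // A missing line (undefined) counts as a difference.
    if (a[i] !== b[i]) {
      return { inSync: false, firstDifferingLine: i + 1 };
    }
  }
  return { inSync: true, firstDifferingLine: null };
}
```

Run in CI against the contents of `AGENTS.md` and `CLAUDE.md`, a failing result would point a maintainer at the first line where the two copies diverged.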

## Recommended Rewrites

### Anti-Rationalization (Add this section to improve reliability)
**Original:** (Missing)
**Proposed:**
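> ### AI Behavior and Misunderstanding-Prevention Rules
> 1. **Verify first**: Do not guess that "the code builds." Always run the compile command in the terminal and confirm that it succeeds.
> 2. **Assume the worst**: Do not guess whether a library or API exists. Before generating code that calls a module, always confirm that it exists with `ls` or `grep`.
> 3. **Safety first**: If a solution is uncertain or raises security concerns, do not generate code; ask the user instead. Never commit "speculative code" to the repository.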
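
The second proposed rule (do not guess whether a library exists) can also be backed by a mechanical check. The sketch below is an illustration only, with hypothetical names: it extracts package names from static `import` and `require` statements with a regular expression and reports those missing from a list of declared dependencies. It deliberately ignores dynamic `import()` calls, and it would flag bare Node built-ins such as `fs` unless they are added to the declared list.

```typescript
// Hypothetical helper: reduce "@scope/pkg/sub" to "@scope/pkg" and "pkg/sub" to "pkg".
function packageNameOf(specifier: string): string {
  const parts = specifier.split("/");
  return specifier.startsWith("@") ? parts.slice(0, 2).join("/") : parts[0];
}

// Return the sorted package names that `source` imports but `declared` does not list.
function findUndeclaredImports(source: string, declared: string[]): string[] {
  // Matches `import ... from "x"`, `import "x"`, and `require("x")`.
  const pattern = /(?:import\s+(?:[^'"]*?\s+from\s+)?|require\(\s*)['"]([^'"]+)['"]/g;
  const known = new Set(declared);
  const missing = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const specifier = match[1];
    // Skip relative paths, absolute paths, and explicit Node built-ins.
    if (
      specifier.startsWith(".") ||
      specifier.startsWith("/") ||
      specifier.startsWith("node:")
    ) {
      continue;
    }
    const name = packageNameOf(specifier);
    if (!known.has(name)) missing.add(name);
  }
  return Array.from(missing).sort();
}
```

In practice `declared` would come from the `dependencies` and `devDependencies` keys of `package.json`; a non-empty result is a signal to stop and verify before committing the generated code.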

## Action Plan

1. **Anti-Rationalization**: Add an "AI Behavioral Guardrails" section to explicitly warn against model shortcuts (hallucination, assuming success, guessing dependencies).
2. **Maintenance**: Delete one of the redundant files (`AGENTS.md` or `CLAUDE.md`) to ensure there is a single source of truth.
3. **Escalation**: Add an "Escalation Policy" section defining exactly when the AI must stop execution and prompt the human (e.g., "If `npm test` is still failing after 10 minutes of attempted fixes, stop and summarize findings").
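
To make the escalation item concrete, the stop conditions named in this evaluation (a blocker such as tests that keep failing, an ambiguous requirement, a suspected security vulnerability) can be written as an explicit decision function. This is a sketch under stated assumptions: the `AgentState` fields, the `Decision` type, and the order in which conditions are checked are all hypothetical. Only the 10-minute figure comes from the example above.

```typescript
// All field names below are illustrative assumptions, not part of the evaluated files.
interface AgentState {
  minutesTestsHaveBeenFailing: number; // time spent with `npm test` failing
  requirementIsAmbiguous: boolean;
  suspectsSecurityIssue: boolean;
}

type Decision =
  | { action: "continue" }
  | { action: "stop"; reason: string };

// Decide whether the agent should keep working or stop and ask the human.
function decideEscalation(state: AgentState, maxFailingMinutes = 10): Decision {
  // Security concerns are checked first so they are never masked by other reasons.
  if (state.suspectsSecurityIssue) {
    return { action: "stop", reason: "suspected security vulnerability" };
  }
  if (state.requirementIsAmbiguous) {
    return { action: "stop", reason: "ambiguous requirement" };
  }
  if (state.minutesTestsHaveBeenFailing > maxFailingMinutes) {
    return { action: "stop", reason: "tests failing beyond the time limit" };
  }
  return { action: "continue" };
}
```

Writing the policy as conditions with fixed thresholds, whether in prose or in code, leaves less room for the agent to decide that "one more attempt" is justified.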