From 067f42ed846e9920e2c11a332ff634582abd06eb Mon Sep 17 00:00:00 2001 From: Quildra Date: Fri, 1 Aug 2025 17:46:53 +0100 Subject: [PATCH] docs: add comprehensive refactoring task list MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Phase 1: Critical fixes (memory leaks, performance killers) - Phase 2: Architecture refactoring (break up god objects) - Phase 3: Performance optimization (image pipeline, NMS) - Phase 4: Code quality & testing (DI, unit tests, error handling) Total: 16 prioritized tasks with dependencies mapped Ready for branch-per-task development workflow πŸ€– Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude --- REFACTORING_TASKS.md | 583 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 583 insertions(+) create mode 100644 REFACTORING_TASKS.md diff --git a/REFACTORING_TASKS.md b/REFACTORING_TASKS.md new file mode 100644 index 0000000..08e4cd7 --- /dev/null +++ b/REFACTORING_TASKS.md @@ -0,0 +1,583 @@ +# Pokemon Detection App Refactoring Task List + +This document outlines a systematic approach to addressing critical issues identified in the codebase analysis. Tasks are organized by priority and include clear descriptions, attack plans, and dependencies. + +## 🚨 **Phase 1: Critical Fixes (Immediate - Week 1)** + +### CRITICAL-001: Fix OpenCV Mat Memory Leaks +**Priority**: πŸ”΄ Critical +**Effort**: 2-3 days +**Dependencies**: None + +**Issue Description**: +OpenCV Mat objects are not properly released in exception paths, causing native memory leaks that lead to app crashes during extended use. + +**Files Affected**: +- `YOLOOnnxDetector.kt:400-450` (preprocessImageAtScale method) +- `YOLOOnnxDetector.kt:680-758` (coordinate transformation methods) +- `ScreenCaptureService.kt:450-550` (image processing pipeline) + +**Attack Plan**: +1. **Audit all Mat usage** - Search for `Mat()` instantiations +2. **Implement try-finally pattern**: + ```kotlin + val mat = Mat() + try { + // processing + } finally { + mat.release() + } + ``` +3. **Create Mat resource management utility**: + ```kotlin + inline fun Mat.use(block: (Mat) -> T): T { + try { return block(this) } + finally { this.release() } + } + ``` +4. **Add memory monitoring** in debug builds +5. **Test with 1-hour continuous detection** to verify no leaks + +**Acceptance Criteria**: +- [ ] All Mat objects have guaranteed release paths +- [ ] Memory usage remains stable during extended testing +- [ ] No native memory warnings in logcat + +--- + +### CRITICAL-002: Fix Bitmap Memory Management +**Priority**: πŸ”΄ Critical +**Effort**: 1-2 days +**Dependencies**: None + +**Issue Description**: +Large bitmaps (screen-sized ARGB_8888) are created but not always recycled, especially in exception paths. + +**Files Affected**: +- `ScreenCaptureService.kt:380-420` (bitmap creation from ImageReader) +- `YOLOOnnxDetector.kt:200-250` (bitmap preprocessing) + +**Attack Plan**: +1. **Audit bitmap lifecycle** - Find all `createBitmap()` calls +2. **Implement bitmap recycling**: + ```kotlin + bitmap?.let { + if (!it.isRecycled) it.recycle() + } + ``` +3. **Use bitmap pools** for frequently allocated sizes +4. **Monitor bitmap memory** with Debug.getNativeHeapAllocatedSize() + +**Acceptance Criteria**: +- [ ] All bitmaps properly recycled in all code paths +- [ ] Bitmap memory usage stays under 50MB +- [ ] No OutOfMemoryError exceptions + +--- + +### CRITICAL-003: Disable Parallel Processing Performance Killer +**Priority**: πŸ”΄ Critical +**Effort**: 1 day +**Dependencies**: None + +**Issue Description**: +`ENABLE_MULTI_PREPROCESSING = true` creates more CPU overhead than performance gain due to context switching on mobile CPUs. + +**Files Affected**: +- `YOLOOnnxDetector.kt:300-335` (parallel preprocessing) +- `YOLOOnnxDetector.kt:25` (configuration constant) + +**Attack Plan**: +1. **Benchmark current performance** - measure detection times +2. **Set ENABLE_MULTI_PREPROCESSING = false** +3. **Remove ThreadPoolExecutor creation** in preprocessing +4. **Benchmark improvement** - expect 20-30% faster detection +5. **Remove dead code** - parallel processing methods + +**Acceptance Criteria**: +- [ ] Single-threaded preprocessing only +- [ ] Detection time improved by 20%+ +- [ ] CPU usage reduced during detection + +--- + +### CRITICAL-004: Fix Thread Pool Resource Leaks +**Priority**: πŸ”΄ Critical +**Effort**: 1-2 days +**Dependencies**: None + +**Issue Description**: +Thread pools and executors are created but not properly shut down, causing resource leaks. + +**Files Affected**: +- `ScreenCaptureService.kt:150-200` (ocrExecutor management) +- `YOLOOnnxDetector.kt:300-335` (temporary executors) + +**Attack Plan**: +1. **Audit all Executor usage** - find creation without shutdown +2. **Implement proper lifecycle management**: + ```kotlin + override fun onDestroy() { + ocrExecutor.shutdown() + if (!ocrExecutor.awaitTermination(5, TimeUnit.SECONDS)) { + ocrExecutor.shutdownNow() + } + } + ``` +3. **Use single shared executor** instead of creating multiple +4. **Add executor monitoring** in debug builds + +**Acceptance Criteria**: +- [ ] All executors have proper shutdown lifecycle +- [ ] Thread count remains stable during app lifecycle +- [ ] No executor-related memory leaks + +--- + +## πŸ—οΈ **Phase 2: Architecture Refactoring (Weeks 2-4)** + +### ARCH-001: Extract Screen Capture Manager +**Priority**: 🟠 High +**Effort**: 3-4 days +**Dependencies**: CRITICAL-001, CRITICAL-002 + +**Issue Description**: +ScreenCaptureService has 8+ responsibilities. Screen capture logic should be separated into dedicated manager. + +**Files Affected**: +- `ScreenCaptureService.kt:240-350` (MediaProjection + ImageReader setup) +- Create new: `ScreenCaptureManager.kt` + +**Attack Plan**: +1. **Create ScreenCaptureManager interface**: + ```kotlin + interface ScreenCaptureManager { + fun startCapture(resultData: Intent): Boolean + fun stopCapture() + fun setImageCallback(callback: (Image) -> Unit) + } + ``` +2. **Extract capture logic** from service +3. **Implement dependency injection** for manager +4. **Create unit tests** for isolated capture logic +5. **Update service** to use manager + +**Acceptance Criteria**: +- [ ] ScreenCaptureManager handles only capture logic +- [ ] Service delegates capture operations to manager +- [ ] Capture logic is unit testable +- [ ] No functionality regression + +--- + +### ARCH-002: Extract ML Inference Engine +**Priority**: 🟠 High +**Effort**: 4-5 days +**Dependencies**: CRITICAL-003, ARCH-001 + +**Issue Description**: +ML model management, inference, and preprocessing should be separated from service into dedicated engine. + +**Files Affected**: +- `YOLOOnnxDetector.kt` (consolidate as primary) +- `ScreenCaptureService.kt:500-800` (ML integration logic) +- Create new: `MLInferenceEngine.kt` + +**Attack Plan**: +1. **Create MLInferenceEngine interface**: + ```kotlin + interface MLInferenceEngine { + suspend fun initialize(): Boolean + suspend fun detect(image: Bitmap): List + fun setConfidenceThreshold(threshold: Float) + fun cleanup() + } + ``` +2. **Consolidate YOLO implementations** - keep only OnnxDetector +3. **Remove YOLOTFLiteDetector** and YOLODetector classes +4. **Extract preprocessing** into separate utility +5. **Implement async interface** with coroutines +6. **Add comprehensive error handling** + +**Acceptance Criteria**: +- [ ] Single YOLO implementation (ONNX) +- [ ] Clean async interface with coroutines +- [ ] Preprocessing utilities separated +- [ ] Engine is unit testable +- [ ] All ML logic removed from service + +--- + +### ARCH-003: Extract Pokemon Data Extractor +**Priority**: 🟠 High +**Effort**: 3-4 days +**Dependencies**: ARCH-002 + +**Issue Description**: +OCR processing and Pokemon data extraction logic should be separated from service. + +**Files Affected**: +- `ScreenCaptureService.kt:900-1100` (OCR and text processing) +- Create new: `PokemonDataExtractor.kt` + +**Attack Plan**: +1. **Create PokemonDataExtractor interface**: + ```kotlin + interface PokemonDataExtractor { + suspend fun extractPokemonInfo(detections: List, image: Bitmap): PokemonInfo? + fun configureRegions(screenSize: Size) + } + ``` +2. **Extract OCR logic** from service +3. **Create region-based processing** utility +4. **Implement async OCR pipeline** (remove blocking latch) +5. **Add text parsing utilities** +6. **Create data validation** logic + +**Acceptance Criteria**: +- [ ] OCR logic separated from detection logic +- [ ] Async OCR pipeline (no blocking) +- [ ] Region-based text extraction +- [ ] Data validation and error handling +- [ ] Unit testable extraction logic + +--- + +### ARCH-004: Create Detection Coordinator +**Priority**: 🟑 Medium +**Effort**: 3-4 days +**Dependencies**: ARCH-001, ARCH-002, ARCH-003 + +**Issue Description**: +Need central coordinator to orchestrate screen capture β†’ ML detection β†’ data extraction pipeline. + +**Files Affected**: +- Create new: `DetectionCoordinator.kt` +- Update: `ScreenCaptureService.kt` (reduce to service lifecycle only) + +**Attack Plan**: +1. **Create DetectionCoordinator**: + ```kotlin + class DetectionCoordinator( + private val captureManager: ScreenCaptureManager, + private val inferenceEngine: MLInferenceEngine, + private val dataExtractor: PokemonDataExtractor + ) { + suspend fun performDetection(): DetectionResult + } + ``` +2. **Implement detection pipeline** orchestration +3. **Add error handling** and retry logic +4. **Implement result caching** for performance +5. **Add detection state management** +6. **Create comprehensive tests** + +**Acceptance Criteria**: +- [ ] Clean pipeline orchestration +- [ ] Proper error handling and recovery +- [ ] Result caching implemented +- [ ] Detection state properly managed +- [ ] Service only handles Android service lifecycle + +--- + +## ⚑ **Phase 3: Performance Optimization (Weeks 3-5)** + +### PERF-001: Optimize Image Conversion Pipeline +**Priority**: 🟠 High +**Effort**: 2-3 days +**Dependencies**: ARCH-002 + +**Issue Description**: +Multiple redundant image conversions (Bitmapβ†’Matβ†’Tensor) with unnecessary memory copies. + +**Files Affected**: +- `YOLOOnnxDetector.kt:400-500` (preprocessing pipeline) + +**Attack Plan**: +1. **Audit conversion pipeline** - map all transformations +2. **Eliminate redundant conversions**: + - Direct Bitmapβ†’Tensor when possible + - Cache intermediate formats + - Reuse allocated tensors +3. **Implement zero-copy optimizations** where possible +4. **Add conversion performance monitoring** +5. **Benchmark improvements** + +**Acceptance Criteria**: +- [ ] Minimum number of image conversions +- [ ] 30%+ reduction in preprocessing time +- [ ] Memory allocations reduced +- [ ] Zero-copy optimizations implemented + +--- + +### PERF-002: Implement Efficient NMS Algorithm +**Priority**: 🟑 Medium +**Effort**: 2 days +**Dependencies**: ARCH-002 + +**Issue Description**: +Current NMS implementation is O(nΒ²) - inefficient for large detection sets. + +**Files Affected**: +- `YOLOTFLiteDetector.kt:200-250` (if still used) +- `YOLOOnnxDetector.kt` (NMS implementation) + +**Attack Plan**: +1. **Research efficient NMS algorithms** (sorted box approach) +2. **Implement O(n log n) NMS**: + ```kotlin + fun efficientNMS(detections: List): List { + return detections + .sortedByDescending { it.confidence } + .fold(mutableListOf()) { kept, current -> + if (kept.none { calculateIoU(it.box, current.box) > threshold }) { + kept.add(current) + } + kept + } + } + ``` +3. **Add IoU calculation optimizations** +4. **Benchmark against current implementation** + +**Acceptance Criteria**: +- [ ] O(n log n) NMS implementation +- [ ] 50%+ faster NMS processing +- [ ] IoU calculations optimized +- [ ] Maintains same accuracy + +--- + +### PERF-003: Remove Synchronous OCR Blocking +**Priority**: 🟑 Medium +**Effort**: 2-3 days +**Dependencies**: ARCH-003 + +**Issue Description**: +OCR uses CountDownLatch.await() which blocks the detection pipeline unnecessarily. + +**Files Affected**: +- `ScreenCaptureService.kt:950-1000` (OCR pipeline) + +**Attack Plan**: +1. **Replace CountDownLatch** with coroutines: + ```kotlin + suspend fun extractTextAsync(image: Bitmap): String { + return withContext(Dispatchers.IO) { + // OCR processing + } + } + ``` +2. **Implement timeout handling** with coroutines +3. **Add OCR result caching** for repeated regions +4. **Pipeline parallelization** - OCR while next detection runs + +**Acceptance Criteria**: +- [ ] No blocking OCR operations +- [ ] Coroutine-based async OCR +- [ ] OCR result caching implemented +- [ ] Pipeline parallelization working + +--- + +## 🧹 **Phase 4: Code Quality & Testing (Weeks 4-6)** + +### QUALITY-001: Remove Magic Numbers and Hard-coded Values +**Priority**: 🟑 Medium +**Effort**: 2-3 days +**Dependencies**: None (can run parallel) + +**Issue Description**: +Numerous magic numbers and hard-coded values throughout codebase make it difficult to configure and maintain. + +**Files Affected**: +- All major classes (50+ magic numbers identified) + +**Attack Plan**: +1. **Create configuration data classes**: + ```kotlin + data class DetectionConfig( + val confidenceThreshold: Float = 0.55f, + val inputSize: Int = 640, + val maxInferenceTimeMs: Long = 4500L, + val captureIntervalMs: Long = 2000L + ) + ``` +2. **Extract UI configuration** (FAB sizes, animation durations) +3. **Create build-time configuration** for model paths +4. **Add runtime configuration** validation +5. **Update all usage sites** + +**Acceptance Criteria**: +- [ ] All magic numbers extracted to configuration +- [ ] Configuration validation implemented +- [ ] Easy to modify behavior without code changes +- [ ] Build-time and runtime configuration separated + +--- + +### QUALITY-002: Implement Comprehensive Error Handling +**Priority**: 🟠 High +**Effort**: 3-4 days +**Dependencies**: ARCH-001, ARCH-002, ARCH-003 + +**Issue Description**: +"Silent failure" anti-pattern appears 50+ times - exceptions are swallowed without proper handling. + +**Files Affected**: +- All major classes (widespread issue) + +**Attack Plan**: +1. **Define error handling strategy**: + ```kotlin + sealed class DetectionError { + object ModelNotLoaded : DetectionError() + data class InferenceTimeout(val timeoutMs: Long) : DetectionError() + data class ImageProcessingFailed(val cause: Throwable) : DetectionError() + } + ``` +2. **Replace try-catch-ignore** with proper error handling +3. **Implement error reporting** mechanism +4. **Add user-facing error messages** +5. **Create error recovery strategies** + +**Acceptance Criteria**: +- [ ] No silent exception swallowing +- [ ] Comprehensive error type definitions +- [ ] User-facing error messages +- [ ] Error recovery mechanisms +- [ ] Error logging and reporting + +--- + +### QUALITY-003: Add Unit Test Coverage +**Priority**: 🟑 Medium +**Effort**: 5-7 days +**Dependencies**: ARCH-001, ARCH-002, ARCH-003, ARCH-004 + +**Issue Description**: +Codebase has no unit tests, making refactoring risky and regression detection impossible. + +**Files Affected**: +- Create: `test/` directory structure +- All refactored classes need tests + +**Attack Plan**: +1. **Set up testing infrastructure**: + - JUnit 5, Mockito, Robolectric + - Test data and mock image generation +2. **Create test strategy**: + ```kotlin + // Example test structure + class MLInferenceEngineTest { + @Test fun `detect should return valid detections for test image`() + @Test fun `detect should handle empty image gracefully`() + @Test fun `detect should respect confidence threshold`() + } + ``` +3. **Achieve 80%+ coverage** on core logic classes +4. **Add integration tests** for pipeline +5. **Set up CI test execution** + +**Acceptance Criteria**: +- [ ] 80%+ unit test coverage on core classes +- [ ] Integration tests for detection pipeline +- [ ] Mock-based testing for dependencies +- [ ] CI pipeline runs tests automatically +- [ ] Test data generation utilities + +--- + +### QUALITY-004: Implement Dependency Injection +**Priority**: 🟑 Medium +**Effort**: 3-4 days +**Dependencies**: ARCH-001, ARCH-002, ARCH-003, ARCH-004 + +**Issue Description**: +Hard dependencies throughout codebase make testing difficult and coupling tight. + +**Files Affected**: +- All refactored classes +- Add: Hilt/Dagger dependencies + +**Attack Plan**: +1. **Add Hilt dependency injection**: + ```kotlin + @Module + @InstallIn(SingletonComponent::class) + object DetectionModule { + @Provides + fun provideMLInferenceEngine(): MLInferenceEngine = YOLOOnnxEngine() + } + ``` +2. **Create injection interfaces** for all major components +3. **Update constructors** to accept dependencies +4. **Create test modules** for mocking +5. **Add scoping annotations** for lifecycle management + +**Acceptance Criteria**: +- [ ] Hilt DI implemented throughout +- [ ] No direct dependency instantiation +- [ ] Test modules for easy mocking +- [ ] Proper dependency scoping +- [ ] Clean separation of concerns + +--- + +## πŸ“‹ **Task Dependencies Visualization** + +``` +Phase 1 (Critical Fixes): +CRITICAL-001 ────┐ +CRITICAL-002 ────┼─→ ARCH-001 ──┐ +CRITICAL-003 β”€β”€β”€β”€β”˜ β”‚ +CRITICAL-004 ────────────────────┼─→ ARCH-002 ──┐ + β”‚ β”‚ +Phase 2 (Architecture): β”‚ β”œβ”€β†’ ARCH-004 + CRITICAL-003 β”€β”˜ β”‚ + β”‚ + ARCH-002 ────┴─→ ARCH-003 + β”‚ +Phase 3 (Performance): β”‚ +PERF-001 ←─────────────────── ARCH-002 β”‚ +PERF-002 ←─────────────────── ARCH-002 β”‚ +PERF-003 ←─────────────────── ARCH-003 β†β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ + +Phase 4 (Quality): +QUALITY-001 (can run parallel) +QUALITY-002 ←─ All ARCH tasks +QUALITY-003 ←─ All ARCH tasks +QUALITY-004 ←─ All ARCH tasks +``` + +## 🎯 **Success Metrics Per Phase** + +### Phase 1 Metrics: +- [ ] Memory usage stable < 100MB during 1-hour test +- [ ] Detection time improved by 20%+ +- [ ] Zero native memory leaks +- [ ] No executor resource leaks + +### Phase 2 Metrics: +- [ ] No classes > 300 lines +- [ ] Clear separation of concerns +- [ ] Service class < 200 lines +- [ ] All components unit testable + +### Phase 3 Metrics: +- [ ] Detection pipeline < 2 seconds +- [ ] Image preprocessing 30% faster +- [ ] NMS processing 50% faster +- [ ] No blocking operations in pipeline + +### Phase 4 Metrics: +- [ ] 80%+ test coverage on core logic +- [ ] Zero magic numbers in business logic +- [ ] Comprehensive error handling +- [ ] Full dependency injection + +--- + +**Next Steps**: Choose which task to start with. I recommend beginning with **CRITICAL-001 (OpenCV Mat Memory Leaks)** as it has no dependencies and addresses the most serious stability issue. \ No newline at end of file