Deep learning has greatly improved the performance of object detection. As detectors have become more accurate, problems that previously remained unsolved are receiving more attention. In CCTV applications such as object tracking, for example, better detection has made the matching problem easier to resolve. YOLOv3, a single-stage, multi-scale object detection network, robustly detects objects of various sizes while maintaining real-time performance. However, multi-scale detectors suffer from an imbalance between positive and negative boxes at each feature scale. In CCTV environments, this imbalance can degrade detection performance because the number of objects, and therefore of positive boxes, is relatively small. Training time also matters, because the detector must be re-trained for each new environment, and new environments are added constantly. To address these issues, we propose multi-scale hard negative mining, which resolves the imbalance problem and improves object detection performance while also reducing training time.
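To make the idea of per-scale hard negative mining concrete, the following is a minimal sketch of the generic technique, not the exact procedure proposed here. It assumes that each feature scale provides a per-box loss and a positive/negative label, and it keeps only the highest-loss negatives on each scale. The function names, the `neg_pos_ratio` parameter, and its default of 3 are illustrative assumptions.

```python
import numpy as np


def mine_hard_negatives(loss, is_positive, neg_pos_ratio=3, min_negatives=1):
    """Select boxes to train on for ONE feature scale.

    Keeps every positive box plus the highest-loss ("hard") negatives,
    capped at neg_pos_ratio negatives per positive. The ratio and the
    min_negatives floor are assumptions made for this sketch.
    """
    loss = np.asarray(loss, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)

    neg_idx = np.flatnonzero(~is_positive)
    num_keep = max(neg_pos_ratio * int(is_positive.sum()), min_negatives)
    num_keep = min(num_keep, neg_idx.size)

    # Rank the negatives from highest to lowest loss and keep the hardest ones.
    order = np.argsort(-loss[neg_idx], kind="stable")
    hardest = neg_idx[order[:num_keep]]

    keep = is_positive.copy()
    keep[hardest] = True
    return keep


def mine_all_scales(losses_per_scale, positives_per_scale, neg_pos_ratio=3):
    """Apply the selection independently on every feature scale."""
    return [
        mine_hard_negatives(loss, pos, neg_pos_ratio)
        for loss, pos in zip(losses_per_scale, positives_per_scale)
    ]
```

Selecting negatives separately on each scale means that a scale with few positives contributes correspondingly few negatives to the loss, and discarding the easy negatives reduces the number of boxes that take part in each training step.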