DGIST Scholar: Missing Ciphertext Recovery and Classification via Deep Learning

Title: Missing Ciphertext Recovery and Classification via Deep Learning

Alternative Title: 딥러닝을 통한 누락된 암호문 복구 및 분류

Author(s): Namki Kim

DGIST Authors: Namki Kim ; Inkyu Moon ; Goorak Kwon

Advisor: Inkyu Moon

Co-Advisor(s): Goorak Kwon

Issued Date: 2023

Awarded Date: 2023-02-01

Type: Thesis

Description: Natural language processing, Deep learning, Sequence to sequence, Ciphertext recovery, Ciphertext classification

Abstract: Recovering and classifying text with missing characters is a major concern in fields such as cryptanalysis and deep learning, especially natural language processing (NLP). A cryptogram may lose sensitive information when text degrades in transit between communication devices, owing to a complex communication environment or to the encryption scheme. Predicting the original data and its category from incomplete information is a challenging task, yet it has received little attention. In this work, we propose a long short-term memory (LSTM) encoder-decoder network with an attention mechanism that predicts randomly missing characters, amounting to 5%, 10%, and 15% of the data, in text encrypted with different classical ciphers. The method takes a ciphertext with missing characters as input and outputs the predicted original ciphertext. The LSTM layers learn the long-term dependencies of the word sequence related to the missing characters. To support predictions over long sequences of missing characters, the attention mechanism computes contextual attention scores over the words to select the most relevant information and passes it to the decoder. The proposed method recovered up to 93% of missing characters, a significantly higher recovery rate than encoder-decoder models built from LSTM or GRU layers alone, without an attention mechanism. The model was also trained and tested on predicting the plaintext from the incomplete ciphertext, and achieved a plaintext prediction rate of up to 91%. In addition, we present a network for classifying ciphertexts with missing characters, based on a deep bidirectional LSTM (BLSTM) and gated recurrent units (GRU), that identifies the category of the incomplete ciphertext. Its performance was tested using widely used evaluation measures. The experiments show that the BLSTM-GRU network achieved a classification accuracy of up to 95%, higher than that of conventional networks using only LSTM, GRU, or BLSTM layers.
This thesis addresses the recovery and classification of ciphertexts with missing characters using deep-learning-based methods.
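Because of complex communication environments and the complexity and low quality of encryption methods, a ciphertext may lose some sensitive information during secret communication between communication devices.
Deep-learning-based methods are being studied in diverse fields such as medicine, cryptanalysis, and translation, and are producing impressive results.
In this study, we applied deep-learning-based techniques to ciphertexts that were encrypted with three classical encryption methods and then had 5%, 10%, or 15% of their characters randomly removed.
The first study recovers the missing characters of an incomplete ciphertext using an encoder-decoder model based on LSTM with attention. It achieved a missing-character recovery rate of up to 93%. For comparison, we also measured the recovery rates of encoder-decoder models based on LSTM and on GRU without attention; the proposed model achieved a much higher recovery rate than both. In an additional experiment, the proposed model was used to predict the plaintext from the incomplete ciphertext and achieved a plaintext prediction rate of up to 91%.
The second study classifies the category of an incomplete ciphertext using a network built from BLSTM, GRU, and LSTM models. It achieved a classification accuracy of up to 95%, better than that of existing networks using only LSTM, GRU, or BLSTM models.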
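The data preparation described in the abstract (encrypt with a classical cipher, then randomly remove 5%, 10%, or 15% of the characters) can be sketched as follows. This is a minimal illustration, not the thesis's actual pipeline: the abstract does not name the ciphers used, so a Caesar shift is assumed here, and the placeholder symbol `_`, the function names, and the definition of recovery rate (the fraction of removed positions predicted correctly) are all hypothetical.

```python
import random
import string
from typing import List, Tuple

ALPHABET = string.ascii_lowercase
MISSING = "_"  # hypothetical placeholder marking a lost character


def caesar_encrypt(plaintext: str, shift: int = 3) -> str:
    """Shift each lowercase letter by `shift`; other characters pass through."""
    out = []
    for ch in plaintext:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def drop_characters(ciphertext: str, rate: float, seed: int = 0) -> Tuple[str, List[int]]:
    """Replace a random fraction `rate` of the letters with MISSING.

    Returns the damaged ciphertext and the sorted positions that were removed.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    letter_positions = [i for i, ch in enumerate(ciphertext) if ch in ALPHABET]
    k = round(rate * len(letter_positions))
    dropped = sorted(rng.sample(letter_positions, k))
    chars = list(ciphertext)
    for i in dropped:
        chars[i] = MISSING
    return "".join(chars), dropped


def recovery_rate(original: str, predicted: str, dropped: List[int]) -> float:
    """Assumed metric: share of removed positions whose character is restored."""
    if not dropped:
        return 1.0
    hits = sum(original[i] == predicted[i] for i in dropped)
    return hits / len(dropped)


if __name__ == "__main__":
    cipher = caesar_encrypt("hello world " * 10)
    for rate in (0.05, 0.10, 0.15):
        damaged, dropped = drop_characters(cipher, rate)
        print(rate, len(dropped), damaged[:24])
```

In a setup like this, the damaged string would serve as the model input and the intact ciphertext (or the plaintext, for the plaintext-prediction experiment) as the target; `recovery_rate` would then be applied to the model's output.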

Table Of Contents: Ⅰ. INTRODUCTION
Ⅱ. METHODOLOGY
2.1 CLASSICAL CIPHERS
2.2 RECURRENT NEURAL NETWORKS (RNN)
2.3 LONG SHORT-TERM MEMORY (LSTM)
2.4 LSTM ENCODER-DECODER MODEL
Ⅲ. PROPOSED APPROACH
3.1 ATTENTION-BASED LSTM ENCODER-DECODER
3.2 WORD EMBEDDING
Ⅳ. RESULTS AND DISCUSSION
4.1 MODEL TRAINING
4.2 PERFORMANCE ANALYSIS
Ⅴ. CONCLUSIONS
References
Korean Abstract

URI: http://hdl.handle.net/20.500.11750/45785

http://dgist.dcollection.net/common/orgView/200000656454

DOI: 10.22677/THESIS.200000656454

Degree: Master

Department: Department of Robotics and Mechatronics Engineering

Publisher: DGIST

Related Researcher

Moon, Inkyu
Research Interests: Intelligent Imaging Systems; AI-based Image Analysis; AI-based Cryptosystems & Cryptanalysis

Files in This Item: There are no files associated with this item.

Appears in Collections: Department of Robotics and Mechatronics Engineering Theses Master
