# Combining Machine and Automata Learning for Network Traffic Classification

Abstract : If the packets generated by an application are viewed as the words of a language, automata learning can be used to derive a packet-based behavioral model of the application. When the alphabet of the learned automata is defined manually in terms of packets, the resulting models may overfit. As some packets always appear together, we apply machine learning techniques to define the alphabet automatically, based on the timing and statistical features of packets. A classifier built on the learned automata models must detect the accepted words of the models in the input. To speed up this time-consuming process, we present a framework, called NeTLang, that identifies the application model in terms of k-testable languages. With the help of machine learning techniques, the classification problem is reduced to observing only $\Theta(k)$ symbols from the input. Our framework combines the two distinct techniques, automata learning and machine learning, to build on their strengths (being fast and accurate) and to eliminate their weaknesses (i.e., ignoring temporal relations among packets). According to our results, NeTLang outperforms state-of-the-art methods that use either technique alone.
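
To make the notion of a k-testable language in the abstract more concrete, here is a minimal Python sketch of the textbook idea. A word is accepted when its prefix and suffix of length k-1 and all of its length-k segments were seen in training. This is not the NeTLang implementation: the function names, the handling of words shorter than k, and the symbols `syn`, `ack`, `data`, `fin` are all assumptions made for illustration.

```python
from typing import Iterable, NamedTuple, Sequence, Set, Tuple

Word = Tuple[str, ...]


class KTestableModel(NamedTuple):
    """Hypothetical container for a k-testable model (not from the paper)."""
    k: int
    prefixes: Set[Word]     # observed prefixes of length k-1
    suffixes: Set[Word]     # observed suffixes of length k-1
    segments: Set[Word]     # observed substrings of length k
    short_words: Set[Word]  # sample words shorter than k, accepted verbatim


def learn_k_testable(samples: Iterable[Sequence[str]], k: int) -> KTestableModel:
    """Collect the local windows of every sample word."""
    if k < 2:
        raise ValueError("this sketch assumes k >= 2")
    model = KTestableModel(k, set(), set(), set(), set())
    for sample in samples:
        word = tuple(sample)
        if len(word) < k:
            # Simplifying assumption: short words are only stored as-is.
            model.short_words.add(word)
            continue
        model.prefixes.add(word[:k - 1])
        model.suffixes.add(word[-(k - 1):])
        for i in range(len(word) - k + 1):
            model.segments.add(word[i:i + k])
    return model


def accepts(model: KTestableModel, sequence: Sequence[str]) -> bool:
    """Check membership using only windows of length k-1 and k."""
    word = tuple(sequence)
    k = model.k
    if len(word) < k:
        return word in model.short_words
    return (
        word[:k - 1] in model.prefixes
        and word[-(k - 1):] in model.suffixes
        and all(word[i:i + k] in model.segments
                for i in range(len(word) - k + 1))
    )


# Hypothetical symbol sequences standing in for an application's packets.
training = [
    ("syn", "ack", "data", "fin"),
    ("syn", "ack", "data", "data", "fin"),
]
model = learn_k_testable(training, k=2)
print(accepts(model, ("syn", "ack", "data", "data", "data", "fin")))  # True
print(accepts(model, ("syn", "data", "fin")))                         # False
```

The first query is accepted although it never appeared in training, because every local window in it was observed. The second is rejected because the segment `("syn", "data")` was never seen.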
Document type : Conference papers

https://hal.inria.fr/hal-03165385
Contributor : Hal Ifip
Submitted on : Wednesday, March 10, 2021 - 4:05:20 PM
Last modification on : Tuesday, May 25, 2021 - 12:28:02 PM
Long-term archiving on : Friday, June 11, 2021 - 7:07:08 PM

### File

##### Restricted access
To satisfy the distribution rights of the publisher, the document is embargoed until : 2023-01-01

### Citation

Zeynab Sabahi-Kaviani, Fatemeh Ghassemi, Zahra Alimadadi. Combining Machine and Automata Learning for Network Traffic Classification. 3rd International Conference on Topics in Theoretical Computer Science (TTCS), Jul 2020, Tehran, Iran. pp.17-31, ⟨10.1007/978-3-030-57852-7_2⟩. ⟨hal-03165385⟩
