4MMSR-2012-2013-project-SSL encrypted traffic model learning
Classification and Inference of SSL Encrypted Traffic
TLS/SSL was designed to ensure the confidentiality and integrity of its payload. Maciej Korczyński's research showed that such traffic can nevertheless be classified without decrypting it. This is achieved by learning models of SSL-enabled services and comparing the current trace with the learned models. In this student research project, we are interested in observing the impact of client SSL libraries on the learned models.
- slides detailing a high-level view of the approach
- generate example traces with payload from TLS-enabled applications and dump them to pcap files
- parse/dissect those pcap files and create a first-order Markov chain for the client-to-server direction (Client->Server): only use the TLS message type (e.g. 22:1) for the states
- classify a given additional trace (not used in the learning phase) according to the learned models
- compare the learned models for characterizing a given website when varying one of the following (change only one parameter at a time):
  - configuration of a given SSL library (e.g., accepted ciphers, cipher priority, ...)
  - client-side SSL library used
  - operating system
- reflect on which parameters to add to the model in order to characterize the website that has been reached when the capture is taken from an intermediate Tor node (neither the first nor the last in a path)
- use additional parameters to infer Client->Server models (e.g. certificate information, timestamp difference, session ID)
- identify parameters in traces to track users at a router level (e.g., SSL encrypted flows generated behind a NAT cannot be easily associated with a particular user)
- final slides summarizing your work (!!! will not be public before Maciej's paper is published !!!)
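
The learning and classification steps listed above can be sketched as follows. This is a minimal illustration, not the method of the referenced papers: it assumes each trace has already been reduced to a list of state labels such as "22:1", and the `START` marker, the `floor` smoothing value, the service names, and the example traces are all made up for the example.

```python
import math
from collections import Counter, defaultdict

START = "START"  # hypothetical marker for the beginning of a flow


def learn_model(traces):
    """Estimate a first-order Markov chain from a list of traces.

    Each trace is a list of state labels such as "22:1".
    Returns {state: {next_state: probability}}.
    """
    counts = defaultdict(Counter)
    for trace in traces:
        prev = START
        for state in trace:
            counts[prev][state] += 1
            prev = state
    model = {}
    for prev, nexts in counts.items():
        total = sum(nexts.values())
        model[prev] = {s: n / total for s, n in nexts.items()}
    return model


def log_likelihood(model, trace, floor=1e-6):
    """Log-probability of a trace under a learned model.

    Transitions never seen during learning get the small probability
    `floor` (an arbitrary smoothing choice) instead of zero.
    """
    score = 0.0
    prev = START
    for state in trace:
        p = model.get(prev, {}).get(state, floor)
        score += math.log(p)
        prev = state
    return score


def classify(models, trace):
    """Return the name of the model under which the trace is most likely."""
    return max(models, key=lambda name: log_likelihood(models[name], trace))


# Made-up Client->Server traces for two hypothetical services.
training = {
    "service_a": [["22:1", "22:16", "20", "22", "23", "23"],
                  ["22:1", "22:16", "20", "22", "23"]],
    "service_b": [["22:1", "20", "22", "23", "21"],
                  ["22:1", "20", "22", "23", "23", "21"]],
}
models = {name: learn_model(traces) for name, traces in training.items()}

# A trace that was not used in the learning phase.
unseen = ["22:1", "22:16", "20", "22", "23", "23", "23"]
print(classify(models, unseen))
```

Working with log-probabilities avoids numerical underflow on long traces; how unseen transitions should be penalized is a design choice left open here.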
- For a given SSL-enabled service, how does a "generalized" model (of the server side when varying the clients) perform against a "simple" server model (learned with only one client)?
- Let two applications A and B use the very same SSL library at the very same version: are the learned models different? Can we see a pattern emerging when comparing several of those?
- Can we learn a model of Tor clients? What are the problems? Propose modifications to the model inference approach to overcome those limitations, and provide the intuition behind each suggestion.
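
One possible way to quantify whether two learned models differ is to compare their transition probabilities state by state. The sketch below is only an assumption about how this could be done: the choice of total variation distance, the penalty of 1.0 for a state present in only one model, and the two example models (`client_x`, `client_y`) are illustrative and do not come from the referenced papers.

```python
def row_distance(p, q):
    """Total variation distance between two next-state distributions."""
    states = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in states)


def model_distance(model_a, model_b):
    """Average per-state distance between two Markov models.

    Models are dicts of the form {state: {next_state: probability}}.
    A state known to only one model counts as maximally different (1.0),
    which is an arbitrary convention chosen for this sketch.
    """
    states = set(model_a) | set(model_b)
    if not states:
        return 0.0
    total = 0.0
    for s in states:
        if s in model_a and s in model_b:
            total += row_distance(model_a[s], model_b[s])
        else:
            total += 1.0
    return total / len(states)


# Made-up models for two hypothetical clients reaching the same service.
client_x = {"22:1": {"22:16": 1.0}, "22:16": {"20": 1.0}}
client_y = {"22:1": {"22:16": 0.5, "20": 0.5}, "22:16": {"20": 1.0}}

print(model_distance(client_x, client_y))
```

A value of 0 means the two models have identical transitions; values closer to 1 indicate that the models share few transitions.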
- Classifying Service Flows in the Encrypted Skype Traffic, Maciej Korczyński and Andrzej Duda, ICC 2012
- SSL/TLS: état des lieux et recommandations, Olivier Levillain, SSTIC 2012 (slides)
- Classifying TLS/SSL Encrypted Application Flows, Maciej Korczyński and Andrzej Duda, 2013 (to be submitted, !!! DO NOT DISTRIBUTE: will not be public before Maciej's paper is published !!!)
- First-Order Markov Chains
- SSL 3.0 RFC
- wireshark, tshark, tcpdump
- wireshark dissectors:
- when playing by hand, to understand: ssldump
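
To see by hand where state labels such as 22:1 come from, the TLS record header can also be read directly. The sketch below assumes the bytes of one direction of a TCP connection have already been reassembled into a single byte string; the function name `record_labels` and the example bytes are made up. It relies only on the record layout (1-byte content type, 2-byte version, 2-byte length) and deliberately simplifies: it labels only the first handshake message of each record, and it stops reading handshake types after a ChangeCipherSpec, since later records are encrypted.

```python
import struct

CHANGE_CIPHER_SPEC = 20  # record content type: ChangeCipherSpec
HANDSHAKE = 22           # record content type: handshake


def record_labels(stream):
    """Yield one label per TLS record in a one-direction byte stream.

    Cleartext handshake records are labelled "content_type:handshake_type"
    (e.g. "22:1" for a ClientHello); all other records are labelled with
    their content type only.
    """
    offset = 0
    encrypted = False
    while offset + 5 <= len(stream):
        # Record header: content type (1 byte), version (2), length (2).
        content_type, _version, length = struct.unpack_from("!BHH", stream, offset)
        body = stream[offset + 5: offset + 5 + length]
        if content_type == HANDSHAKE and body and not encrypted:
            # First byte of a cleartext handshake message is its type.
            yield f"{content_type}:{body[0]}"
        else:
            yield str(content_type)
        if content_type == CHANGE_CIPHER_SPEC:
            # Records sent after ChangeCipherSpec are encrypted.
            encrypted = True
        offset += 5 + length


# Made-up stream: a handshake record of type 1, a ChangeCipherSpec,
# an encrypted handshake record, and an application data record.
stream = (
    b"\x16\x03\x01\x00\x04" + b"\x01\x00\x00\x00"
    + b"\x14\x03\x01\x00\x01" + b"\x01"
    + b"\x16\x03\x01\x00\x02" + b"\xab\xcd"
    + b"\x17\x03\x01\x00\x03" + b"abc"
)
print(list(record_labels(stream)))
```

In practice the wireshark/tshark dissectors handle reassembly and record parsing; this sketch is only meant to make the record structure concrete.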
- create an archive on the ensimag server, so that only your team members and I have access to it.
- obviously, do not forget to send me the path afterwards
Fabien Duchene & maciej.korczynski !!__at__!! gmail.com