Typing Behavior based Continuous User Authentication

We hypothesize that an individual computer user has a unique and consistent habitual pattern of Typing Behavior, independent of the text, while typing on a keyboard. This habit manifests itself visually through the shapes and motions of the hands, as well as audibly through the sounds produced when pressing and releasing keys. Given a webcam pointing towards a keyboard, we develop real-time computer vision and acoustic algorithms to automatically extract the habitual patterns from the video and audio streams and continuously verify the identity of the active computer user.

Unlike conventional authentication schemes, continuous authentication has a number of advantages, such as a longer time window for sensing, the ability to rectify authentication decisions, and persistent verification of a user's identity, which are critical in applications demanding enhanced security. Traditional biometric modalities such as face and fingerprint have various drawbacks when used for continuous authentication, such as privacy concerns and interference with normal computer operation. We propose typing behavior as a non-intrusive, privacy-aware biometric modality that utilizes standard interactions with the keyboard peripheral.

Visual Typing Behavior

To capture the unique and consistent hand movements during typing, we use a simple webcam pointed at the keyboard. Given the video, the proposed system segments the hands from the background and separates the left hand from the right. A shape context based on the silhouette is extracted from each hand and combined with the hand's position relative to the keyboard. We also propose a novel extension of Bag of Phrases, Bag of Multi-dimensional Phrases, in which a probe video finds corresponding hand shapes across the temporal domain, independently for each hand.
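The pipeline above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes a static keyboard background (so simple background subtraction suffices), splits the two hands at the median foreground column, and computes a log-polar histogram of silhouette pixels in the spirit of shape contexts. All function names and parameters here are illustrative.

```python
import numpy as np

def segment_hands(frame, background, thresh=30):
    """Foreground mask via simple background subtraction (an assumed
    stand-in for the system's hand/background segmentation)."""
    diff = np.abs(frame.astype(int) - background.astype(int))
    return diff > thresh

def split_left_right(mask):
    """Split the foreground into left/right hands at the median
    foreground column (a simplifying assumption)."""
    ys, xs = np.nonzero(mask)
    mid = int(np.median(xs))
    left, right = mask.copy(), mask.copy()
    left[:, mid:] = False
    right[:, :mid] = False
    return left, right

def shape_context(mask, n_r=3, n_theta=8):
    """Log-polar histogram of silhouette pixels about their centroid,
    in the spirit of a shape-context descriptor."""
    ys, xs = np.nonzero(mask)
    dy, dx = ys - ys.mean(), xs - xs.mean()
    r = np.log1p(np.hypot(dx, dy) / (np.hypot(dx, dy).mean() + 1e-9))
    theta = np.arctan2(dy, dx)                     # in [-pi, pi]
    r_edges = np.linspace(0, r.max() + 1e-9, n_r + 1)
    t_edges = np.linspace(-np.pi, np.pi + 1e-9, n_theta + 1)
    hist, _, _ = np.histogram2d(r, theta, bins=[r_edges, t_edges])
    return (hist / hist.sum()).ravel()             # normalized descriptor
```

In the full system these per-frame descriptors, together with hand position, form the visual words that Bag of Multi-dimensional Phrases matches across time, independently for each hand.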

Concept figure of visual typing-based authentication

Acoustic Typing Behavior

Given the keyboard sound recorded by the webcam, our system extracts discriminative features and performs matching between a gallery and a probe sound stream. Motivated by the concept of digraphs used in modeling keystroke dynamics, we learn a virtual alphabet from keystroke sound segments, from which the digraph latency within pairs of virtual letters as well as other statistical features are used to generate match scores. The resultant multiple scores are indicative of the similarities between two sound streams, and are fused to make a final authentication decision.
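One way to make the virtual-alphabet idea concrete is the sketch below. It is an assumption-laden toy, not the paper's method: keystroke sound segments are represented as feature vectors, a small k-means (with deterministic farthest-point seeding) clusters them into "virtual letters," digraph latencies are binned by virtual-letter pair, and a single latency-based score compares two streams. The full system fuses several such scores.

```python
import numpy as np

def learn_virtual_alphabet(X, k, iters=20):
    """Cluster keystroke feature vectors into k 'virtual letters'
    with a minimal k-means (farthest-point initialization)."""
    idx = [0]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - X[i], axis=1) for i in idx], axis=0)
        idx.append(int(d.argmax()))
    centers = X[idx].copy()
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def digraph_latencies(labels, times, k):
    """Mean latency between consecutive keystrokes, binned by the
    (virtual letter, virtual letter) pair they form."""
    sums, counts = np.zeros((k, k)), np.zeros((k, k))
    for a, b, dt in zip(labels[:-1], labels[1:], np.diff(times)):
        sums[a, b] += dt
        counts[a, b] += 1
    lat = np.full((k, k), np.nan)
    np.divide(sums, counts, out=lat, where=counts > 0)  # NaN = unseen pair
    return lat

def match_score(lat_gallery, lat_probe):
    """One latency-based similarity: negative mean absolute latency
    difference over digraph pairs observed in both streams."""
    both = ~np.isnan(lat_gallery) & ~np.isnan(lat_probe)
    if not both.any():
        return -np.inf
    return -np.abs(lat_gallery[both] - lat_probe[both]).mean()
```

A higher (less negative) score indicates more similar typing rhythm between the gallery and probe streams; thresholding the fused scores yields the authentication decision.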

Basic concept of biometric authentication via keystroke sound

Typing Behavior Dataset

We collect a first-of-its-kind keystroke database in two phases. Phase 1 includes 51 subjects typing multiple same-day sessions of both fixed and free text, with acoustic and video information. Phase 2 includes 30 subjects typing multiple free-text sessions on different days spread across several months, with video information as well as keystroke timing information for use with conventional keystroke dynamics.

To obtain a copy of this dataset, please email Joseph Roth at rothjos1[at]msu[dot]edu with the subject: "MSU Typing Behavior Database download".


Any publications arising from the use of this database, including but not limited to academic journal and conference publications, technical reports and manuals, must cite the following work:

  • Joseph Roth, Xiaoming Liu, Arun Ross, and Dimitris Metaxas, "Investigating the Discriminative Power of Keystroke Sound," IEEE Transactions on Information Forensics and Security, Vol. 10, No. 2, pp. 333-345, Feb. 2015. PDF
  • Joseph Roth, Xiaoming Liu, and Dimitris Metaxas, "On Continuous User Authentication via Typing Behavior," IEEE Transactions on Image Processing, Vol. 23, No. 10, pp. 4611-4624, Oct. 2014. PDF
  • Joseph Roth, Xiaoming Liu, Arun Ross, and Dimitris Metaxas, "Biometric Authentication via Keystroke Sound," in Proceedings of the 6th IAPR International Conference on Biometrics (ICB) 2013, Madrid, Spain, June 4-7, 2013. PDF
