Zhang, Zhihan
2018-11-15; 2018-11-15; 2018-12-01
http://hdl.handle.net/2097/39297
The amount of data generated from stock market trading is massive. For example, roughly 10 million trades are performed each day on the NASDAQ stock exchange. A significant proportion of these trades are made by high-frequency traders, entities that each make on the order of thousands of trades or more per day. However, the stock-market factors that drive the decisions of high-frequency traders are poorly understood. Recently, hybridized threshold clustering (HTC) has been proposed as a way of clustering large-to-massive datasets. In this report, we use three months of NASDAQ HFT data (a dataset containing information on all trades of 120 different stocks, including identifiers indicating whether the buyer and/or seller were high-frequency traders) to investigate the trading patterns of high-frequency traders, and we explore the use of HTC to identify these patterns. We find that, while HTC can be successfully performed on the NASDAQ HFT dataset, the amount of information gleaned from this clustering is limited. Instead, we show that an understanding of the habits of high-frequency traders may be gained by looking at "janky" trades: those in which the number of shares traded is not a multiple of 10. We present evidence that janky trades are more common among high-frequency traders. Additionally, we suggest that a large number of small, janky trades may help signal that a large trade will happen shortly afterward.
en-US
Clustering; Stock market dataset; High-frequency trades
Selected results from clustering and analyzing stock market trade data
Report
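The abstract defines a janky trade as one whose share count is not a multiple of 10. A minimal Python sketch of that definition follows, together with one way the janky rate might be compared across trader groups. The function names, the `(group_label, shares)` record layout, and the sample values are assumptions made for illustration; they are not taken from the report or from the NASDAQ HFT dataset.

```python
from collections import defaultdict


def is_janky(shares: int) -> bool:
    """Return True when the share count is not a multiple of 10."""
    return shares % 10 != 0


def janky_rate_by_group(trades):
    """Map each group label to the fraction of its trades that are janky.

    `trades` is an iterable of (group_label, shares) pairs. This layout is
    hypothetical and does not reflect the dataset's actual schema.
    """
    totals = defaultdict(int)
    janky = defaultdict(int)
    for group, shares in trades:
        totals[group] += 1
        if is_janky(shares):
            janky[group] += 1
    return {group: janky[group] / totals[group] for group in totals}


# Made-up trades for illustration only, not real market data.
sample = [
    ("hft", 37), ("hft", 100), ("hft", 5),
    ("non_hft", 200), ("non_hft", 150), ("non_hft", 73),
]
rates = janky_rate_by_group(sample)
```

On real data, the group label would presumably be derived from the dataset's buyer and seller high-frequency-trader identifiers; how the report itself builds that comparison is not stated in the abstract.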