Canonical sectors and evolution of US stocks: an application of machine learning in Python

01:30 PM - 02:00 PM on August 16, 2014, Room 705

RICKY CHACHRA

Audience level: intermediate
Watch: http://youtu.be/g67dp1V1TIk

Description

An unsupervised machine learning algorithm that exploits the underlying structure of historical stock market returns shows promising classification results, with implications for macroeconomic analysis and for creating financial indices.

Abstract

Classifying companies into sectors of the economy is important for macroeconomic analysis and for investing in sector-specific financial indices or exchange-traded funds (ETFs). Major industrial classification systems and financial indices are developed largely by hand, relying on expert opinion and stock picking. Here we show how a broad sector decomposition of stocks can be made more objectively and comprehensively via unsupervised machine learning. A low-dimensional structure that emerges in the space of historical stock-price returns makes it possible to automatically identify "canonical sectors" in the market and to assign every stock a participation weight in each sector. Furthermore, by analyzing data from different time periods separately, we show how firms listed in the market have evolved in their decomposition into sectors.
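
To make the idea of participation weights concrete, here is a minimal sketch in Python with NumPy. It is not the algorithm presented in the talk, which the abstract does not specify. It is a stand-in that assumes synthetic returns built as mixtures of three hidden sector series, projects them to a low-dimensional space with an SVD, picks "corner" stocks with a farthest-first heuristic, and expresses every stock as clipped barycentric weights over those corners. All names, sizes, and parameters below are hypothetical.

```python
import numpy as np


def participation_weights(returns, n_sectors):
    """Return (weights, corner_idx) for a (n_stocks, n_days) returns matrix.

    Stand-in technique (an assumption, not the talk's method):
    SVD projection + farthest-first corner picking + barycentric weights.
    """
    # Project stocks into n_sectors - 1 dimensions; mixtures of k
    # sectors lie (up to noise) in a (k - 1)-dimensional simplex.
    centered = returns - returns.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    dim = n_sectors - 1
    embedding = u[:, :dim] * s[:dim]

    # Farthest-first traversal: start from the stock farthest from the
    # centroid, then repeatedly add the stock farthest from all chosen ones.
    corner_idx = [int(np.argmax(np.linalg.norm(embedding, axis=1)))]
    while len(corner_idx) < n_sectors:
        diffs = embedding[:, None, :] - embedding[corner_idx][None, :, :]
        min_dist = np.linalg.norm(diffs, axis=2).min(axis=1)
        corner_idx.append(int(np.argmax(min_dist)))
    corners = embedding[corner_idx]

    # Barycentric coordinates: solve corners.T @ w = x with sum(w) = 1.
    a = np.vstack([corners.T, np.ones(n_sectors)])
    b = np.vstack([embedding.T, np.ones(len(embedding))])
    weights = np.linalg.lstsq(a, b, rcond=None)[0].T

    # Stocks outside the simplex get negative coordinates; clip them
    # and renormalize so each row is a valid set of weights.
    weights = np.clip(weights, 0.0, None)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights, corner_idx


# Synthetic data (hypothetical): 60 stocks, 250 days, 3 hidden sectors.
rng = np.random.default_rng(0)
n_days, n_stocks, n_sectors = 250, 60, 3
sector_returns = rng.normal(0.0, 0.01, size=(n_sectors, n_days))
true_weights = rng.dirichlet([0.3] * n_sectors, size=n_stocks)
noise = rng.normal(0.0, 0.002, size=(n_stocks, n_days))
returns = true_weights @ sector_returns + noise

weights, corner_idx = participation_weights(returns, n_sectors)
print(weights[:5].round(2))
```

Each row of `weights` is nonnegative and sums to one, so it can be read as the share of a stock attributed to each recovered sector. Running the same function on returns from separate time windows would give one weight matrix per window, which is one simple way to look at how a firm's decomposition changes over time.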