A Study of Administrative Data Representation for Machine Learning