Learning Transformations From Video