True label inference from heterogeneous data sources in Natural Language Processing