Scalable Alignment of Large Language Models Towards Truth Seeking, Complex Reasoning, and Human Values