Detecting Saliency by Combining Speech and Object Detection in Indoor Environments