Foreground occlusion is a significant challenge in three-dimensional (3-D) reconstruction. In this paper, we first characterize the differences between multiview reconstruction with and without foreground occlusion. To reconstruct the target scene, we see through the foreground occlusion using synthetic aperture photography. Unlike existing methods, we propose a more general model for scene reconstruction, in which the target scene may not be fully observed in any reference view. Assuming that both scene depth and appearance are unknown, we reconstruct the 3-D scene from camera array data by selecting optimal views with pixel-based clustering. We then propose an iterative reconstruction approach within a global optimization framework, in which the reconstruction results are refined using a coarse-to-fine strategy. Even when all views are partially occluded, our approach can recover an accurate depth map as well as the scene appearance from camera array data. Experimental results indicate that the proposed approach is more robust to foreground occlusions and outperforms state-of-the-art approaches.