Perception of external events often depends on integrating information from different senses. Many studies provide strong evidence for visual-tactile integration, yet understanding how visual and tactile information are merged remains a challenging problem. Here, a neural network model is used to investigate the mechanisms underlying visual-tactile interactions. The model comprises two unimodal areas (one visual, one tactile) that send feedforward connections to a downstream bimodal area. The unimodal areas influence each other via two synaptic mechanisms: feedback synapses from the bimodal area and direct reciprocal connections. The network reproduces a variety of visual-tactile interactions: 1) detection of faint tactile stimuli is facilitated by concomitant visual input; 2) tactile spatial resolution is improved by visual information; 3) cross-modal benefits are greatest when unisensory information is poor (inverse effectiveness); and 4) conflicts between the modalities are resolved in favor of the more reliable sensory cue. The model identifies distinct roles for the two mechanisms: the feedback synapses are fundamental for improving detection of low-intensity tactile stimuli under cross-modal stimulation, whereas the direct synapses are mostly implicated in visual enhancement of tactile spatial localization and resolution. A better understanding of how vision and touch interact in the nervous system may contribute to physiological knowledge, clinical practice, and technological applications.
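The architecture described above (two unimodal areas, a bimodal area, feedback synapses, and direct reciprocal connections) can be illustrated with a toy rate model. This is a minimal sketch under stated assumptions, not the model used in the study: the sigmoid nonlinearity, the one-to-one topographic wiring, the Gaussian stimulus shape, and every numeric value (`w_ff`, `w_fb`, `w_direct`, `theta`, `slope`, stimulus amplitudes) are hypothetical choices made only for illustration.

```python
import math


def sigmoid(u, theta=1.0, slope=6.0):
    """Assumed neuron nonlinearity: low output below theta, high above."""
    return 1.0 / (1.0 + math.exp(-slope * (u - theta)))


def gaussian_input(n, center, amplitude, sigma=2.0):
    """Assumed external stimulus: a Gaussian bump over n positions."""
    return [amplitude * math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2))
            for i in range(n)]


def simulate(tactile_amp, visual_amp, n=21, center=10,
             w_ff=1.0, w_fb=0.6, w_direct=0.3, steps=200, dt=0.1):
    """Relax the three areas toward steady state.

    Hypothetical simplification: each synapse links only neurons at the
    same position (one-to-one topographic wiring).
    Returns the activities of the tactile, visual, and bimodal areas.
    """
    ext_t = gaussian_input(n, center, tactile_amp)
    ext_v = gaussian_input(n, center, visual_amp)
    a_t = [0.0] * n  # tactile unimodal area
    a_v = [0.0] * n  # visual unimodal area
    b = [0.0] * n    # bimodal area
    for _ in range(steps):
        # Unimodal input = external stimulus + feedback from the bimodal
        # area + direct input from the other unimodal area.
        tgt_t = [sigmoid(ext_t[i] + w_fb * b[i] + w_direct * a_v[i])
                 for i in range(n)]
        tgt_v = [sigmoid(ext_v[i] + w_fb * b[i] + w_direct * a_t[i])
                 for i in range(n)]
        # Bimodal input = feedforward sum of the two unimodal areas.
        tgt_b = [sigmoid(w_ff * (a_t[i] + a_v[i])) for i in range(n)]
        # First-order relaxation toward the target activities.
        a_t = [a + dt * (x - a) for a, x in zip(a_t, tgt_t)]
        a_v = [a + dt * (x - a) for a, x in zip(a_v, tgt_v)]
        b = [a + dt * (x - a) for a, x in zip(b, tgt_b)]
    return a_t, a_v, b


# Faint tactile stimulus alone versus paired with a visual stimulus
# at the same position (amplitudes are arbitrary illustrative values).
alone, _, _ = simulate(tactile_amp=0.7, visual_amp=0.0)
paired, _, _ = simulate(tactile_amp=0.7, visual_amp=1.5)
print("peak tactile activity, tactile only:", round(max(alone), 3))
print("peak tactile activity, with vision: ", round(max(paired), 3))
```

In this sketch, the peak tactile response to the faint stimulus is larger when the visual stimulus is present, which is the qualitative pattern of interaction 1). Setting `w_fb=0` or `w_direct=0` removes one of the two pathways and gives a rough way to compare their contributions in the toy setting; any conclusions drawn this way apply to the sketch only, not to the published model.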