A transitory image sequence is one in which no scene element is visible throughout the entire sequence. When a camera system scans a scene that cannot be covered by a single view, the resulting image sequence is transitory. This paper deals with some major theoretical and algorithmic issues associated with the task of estimating structure and motion from transitory image sequences. It is shown that integration over a transitory sequence has properties that are very different from those of integration over a non-transitory one. Two representations, world-centered (WC) and camera-centered (CC), behave very differently with a transitory sequence. The asymptotic error rates derived in this paper indicate that one representation is significantly superior to the other, depending on whether one needs camera-centered or world-centered estimates. Using the Cramér-Rao lower bound, the paper also shows that these error rates are not only the rates obtained by the proposed algorithm, but also the best rates possible. Based on the error rate analysis, we introduce an efficient ``cross-frame'' estimation technique for the CC representation. For the WC representation, our analysis indicates that a good technique should be based on the global pose of the camera instead of interframe motions. In addition to tests with synthetic data, rigorous experiments were conducted with real image sequences taken by a fully calibrated camera system. Comparison of the experimental results with the ground truth demonstrates that good accuracy can be obtained from transitory image sequences.